Problem Statement¶

Business Context¶

Renewable energy sources play an increasingly important role in the global energy mix, as the effort to reduce the environmental impact of energy production increases.

Out of all the renewable energy alternatives, wind energy is one of the most developed technologies worldwide. The U.S Department of Energy has put together a guide to achieving operational efficiency using predictive maintenance practices.

Predictive maintenance uses sensor information and analysis methods to measure and predict degradation and future component capability. The idea behind predictive maintenance is that failure patterns are predictable and if component failure can be predicted accurately and the component is replaced before it fails, the costs of operation and maintenance will be much lower.

The sensors fitted across different machines involved in the process of energy generation collect data related to various environmental factors (temperature, humidity, wind speed, etc.) and additional features related to various parts of the wind turbine (gearbox, tower, blades, break, etc.).

Objective¶

“ReneWind” is a company working on improving the machinery/processes involved in the production of wind energy using machine learning and has collected data of generator failure of wind turbines using sensors. They have shared a ciphered version of the data, as the data collected through sensors is confidential (the type of data collected varies with companies). Data has 40 predictors, 20000 observations in the training set and 5000 in the test set.

The objective is to build various classification models, tune them, and find the best one that will help identify failures so that the generators could be repaired before failing/breaking to reduce the overall maintenance cost. The nature of predictions made by the classification model will translate as follows:

  • True positives (TP) are failures correctly predicted by the model. These will result in repairing costs.
  • False negatives (FN) are real failures where there is no detection by the model. These will result in replacement costs.
  • False positives (FP) are detections where there is no failure. These will result in inspection costs.

It is given that the cost of repairing a generator is much less than the cost of replacing it, and the cost of inspection is less than the cost of repair.

“1” in the target variables should be considered as “failure” and “0” represents “No failure”.

Data Description¶

The data provided is a transformed version of the original data which was collected using sensors.

  • Train.csv - To be used for training and tuning of models.
  • Test.csv - To be used only for testing the performance of the final best model.

Both the datasets consist of 40 predictor variables and 1 target variable.

Installing and Importing the necessary libraries¶

In [ ]:
# Installing the libraries with the specified version
!pip install --no-deps tensorflow==2.18.0 scikit-learn==1.3.2 matplotlib===3.8.3 seaborn==0.13.2 numpy==1.26.4 pandas==2.2.2 -q --user --no-warn-script-location
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 61.0/61.0 kB 2.2 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 615.5/615.5 MB 1.9 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 10.8/10.8 MB 38.7 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.6/11.6 MB 29.8 MB/s eta 0:00:00
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 18.0/18.0 MB 22.1 MB/s eta 0:00:00
In [1]:
# Library for data manipulation and analysis.
#reading and manipulating data
import pandas as pd
import numpy as np

#data visualization
import matplotlib.pyplot as plt
import seaborn as sns

#for imputing
from sklearn.impute import SimpleImputer

#metrics
from sklearn import metrics
from sklearn.metrics import f1_score, accuracy_score, recall_score, precision_score, confusion_matrix, ConfusionMatrixDisplay, classification_report

#splitting datasets
from sklearn.model_selection import train_test_split

#neural network items
import time
import keras
from tensorflow.keras import optimizers
import tensorflow as tf
from tensorflow import keras
from keras import backend
from keras.models import Sequential
from keras.layers import Dense, Dropout, BatchNormalization
from tensorflow.keras import regularizers

# To supress warnings
import warnings

warnings.filterwarnings("ignore")
In [2]:
#a global random seed was set to 812 to ensure reproducibility
keras.utils.set_random_seed(812)

tf.config.experimental.enable_op_determinism()

Note:

  • After running the above cell, kindly restart the runtime (for Google Colab) or notebook kernel (for Jupyter Notebook), and run all cells sequentially from the next cell.
  • On executing the above line of code, you might see a warning regarding package dependencies. This error message can be ignored as the above code ensures that all necessary libraries and their dependencies are maintained to successfully execute the code in this notebook.

Loading the Data¶

In [3]:
# uncomment and run the following lines for Google Colab
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
In [4]:
df = pd.read_csv('/content/drive/MyDrive/renewind/Train.csv')
df_test = pd.read_csv('/content/drive/MyDrive/renewind/Test.csv')

Data Overview¶

In [ ]:
print('df.head for training data')
df.head()
df.head for training data
Out[ ]:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ... V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
0 -4.464606 -4.679129 3.101546 0.506130 -0.221083 -2.032511 -2.910870 0.050714 -1.522351 3.761892 ... 3.059700 -1.690440 2.846296 2.235198 6.667486 0.443809 -2.369169 2.950578 -3.480324 0
1 3.365912 3.653381 0.909671 -1.367528 0.332016 2.358938 0.732600 -4.332135 0.565695 -0.101080 ... -1.795474 3.032780 -2.467514 1.894599 -2.297780 -1.731048 5.908837 -0.386345 0.616242 0
2 -3.831843 -5.824444 0.634031 -2.418815 -1.773827 1.016824 -2.098941 -3.173204 -2.081860 5.392621 ... -0.257101 0.803550 4.086219 2.292138 5.360850 0.351993 2.940021 3.839160 -4.309402 0
3 1.618098 1.888342 7.046143 -1.147285 0.083080 -1.529780 0.207309 -2.493629 0.344926 2.118578 ... -3.584425 -2.577474 1.363769 0.622714 5.550100 -1.526796 0.138853 3.101430 -1.277378 0
4 -0.111440 3.872488 -3.758361 -2.982897 3.792714 0.544960 0.205433 4.848994 -1.854920 -6.220023 ... 8.265896 6.629213 -10.068689 1.222987 -3.229763 1.686909 -2.163896 -3.644622 6.510338 0

5 rows × 41 columns

In [ ]:
print('df.head for test data')
df_test.head()
df.head for test data
Out[ ]:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ... V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
0 -0.613489 -3.819640 2.202302 1.300420 -1.184929 -4.495964 -1.835817 4.722989 1.206140 -0.341909 ... 2.291204 -5.411388 0.870073 0.574479 4.157191 1.428093 -10.511342 0.454664 -1.448363 0
1 0.389608 -0.512341 0.527053 -2.576776 -1.016766 2.235112 -0.441301 -4.405744 -0.332869 1.966794 ... -2.474936 2.493582 0.315165 2.059288 0.683859 -0.485452 5.128350 1.720744 -1.488235 0
2 -0.874861 -0.640632 4.084202 -1.590454 0.525855 -1.957592 -0.695367 1.347309 -1.732348 0.466500 ... -1.318888 -2.997464 0.459664 0.619774 5.631504 1.323512 -1.752154 1.808302 1.675748 0
3 0.238384 1.458607 4.014528 2.534478 1.196987 -3.117330 -0.924035 0.269493 1.322436 0.702345 ... 3.517918 -3.074085 -0.284220 0.954576 3.029331 -1.367198 -3.412140 0.906000 -2.450889 0
4 5.828225 2.768260 -1.234530 2.809264 -1.641648 -1.406698 0.568643 0.965043 1.918379 -2.774855 ... 1.773841 -1.501573 -2.226702 4.776830 -6.559698 -0.805551 -0.276007 -3.858207 -0.537694 0

5 rows × 41 columns

In [ ]:
print ('df.tail for training data')
df.tail()
df.tail for training data
Out[ ]:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ... V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
19995 -2.071318 -1.088279 -0.796174 -3.011720 -2.287540 2.807310 0.481428 0.105171 -0.586599 -2.899398 ... -8.273996 5.745013 0.589014 -0.649988 -3.043174 2.216461 0.608723 0.178193 2.927755 1
19996 2.890264 2.483069 5.643919 0.937053 -1.380870 0.412051 -1.593386 -5.762498 2.150096 0.272302 ... -4.159092 1.181466 -0.742412 5.368979 -0.693028 -1.668971 3.659954 0.819863 -1.987265 0
19997 -3.896979 -3.942407 -0.351364 -2.417462 1.107546 -1.527623 -3.519882 2.054792 -0.233996 -0.357687 ... 7.112162 1.476080 -3.953710 1.855555 5.029209 2.082588 -6.409304 1.477138 -0.874148 0
19998 -3.187322 -10.051662 5.695955 -4.370053 -5.354758 -1.873044 -3.947210 0.679420 -2.389254 5.456756 ... 0.402812 3.163661 3.752095 8.529894 8.450626 0.203958 -7.129918 4.249394 -6.112267 0
19999 -2.686903 1.961187 6.137088 2.600133 2.657241 -4.290882 -2.344267 0.974004 -1.027462 0.497421 ... 6.620811 -1.988786 -1.348901 3.951801 5.449706 -0.455411 -2.202056 1.678229 -1.974413 0

5 rows × 41 columns

In [ ]:
print ('df_test.tail for test data')
df_test.tail()
df_test.tail for test data
Out[ ]:
V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 ... V32 V33 V34 V35 V36 V37 V38 V39 V40 Target
4995 -5.120451 1.634804 1.251259 4.035944 3.291204 -2.932230 -1.328662 1.754066 -2.984586 1.248633 ... 9.979118 0.063438 0.217281 3.036388 2.109323 -0.557433 1.938718 0.512674 -2.694194 0
4996 -5.172498 1.171653 1.579105 1.219922 2.529627 -0.668648 -2.618321 -2.000545 0.633791 -0.578938 ... 4.423900 2.603811 -2.152170 0.917401 2.156586 0.466963 0.470120 2.196756 -2.376515 0
4997 -1.114136 -0.403576 -1.764875 -5.879475 3.571558 3.710802 -2.482952 -0.307614 -0.921945 -2.999141 ... 3.791778 7.481506 -10.061396 -0.387166 1.848509 1.818248 -1.245633 -1.260876 7.474682 0
4998 -1.703241 0.614650 6.220503 -0.104132 0.955916 -3.278706 -1.633855 -0.103936 1.388152 -1.065622 ... -4.100352 -5.949325 0.550372 -1.573640 6.823936 2.139307 -4.036164 3.436051 0.579249 0
4999 -0.603701 0.959550 -0.720995 8.229574 -1.815610 -2.275547 -2.574524 -1.041479 4.129645 -2.731288 ... 2.369776 -1.062408 0.790772 4.951955 -7.440825 -0.069506 -0.918083 -2.291154 -5.362891 0

5 rows × 41 columns

In [ ]:
print('Training data shape')
df.shape
Training data shape
Out[ ]:
(20000, 41)
In [ ]:
print('Test data shape')
df_test.shape
Test data shape
Out[ ]:
(5000, 41)
  • Training data has 20000 rows and 41 columns
  • Test data has 5000 rows and 41 columns
In [ ]:
df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 20000 entries, 0 to 19999
Data columns (total 41 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   V1      19982 non-null  float64
 1   V2      19982 non-null  float64
 2   V3      20000 non-null  float64
 3   V4      20000 non-null  float64
 4   V5      20000 non-null  float64
 5   V6      20000 non-null  float64
 6   V7      20000 non-null  float64
 7   V8      20000 non-null  float64
 8   V9      20000 non-null  float64
 9   V10     20000 non-null  float64
 10  V11     20000 non-null  float64
 11  V12     20000 non-null  float64
 12  V13     20000 non-null  float64
 13  V14     20000 non-null  float64
 14  V15     20000 non-null  float64
 15  V16     20000 non-null  float64
 16  V17     20000 non-null  float64
 17  V18     20000 non-null  float64
 18  V19     20000 non-null  float64
 19  V20     20000 non-null  float64
 20  V21     20000 non-null  float64
 21  V22     20000 non-null  float64
 22  V23     20000 non-null  float64
 23  V24     20000 non-null  float64
 24  V25     20000 non-null  float64
 25  V26     20000 non-null  float64
 26  V27     20000 non-null  float64
 27  V28     20000 non-null  float64
 28  V29     20000 non-null  float64
 29  V30     20000 non-null  float64
 30  V31     20000 non-null  float64
 31  V32     20000 non-null  float64
 32  V33     20000 non-null  float64
 33  V34     20000 non-null  float64
 34  V35     20000 non-null  float64
 35  V36     20000 non-null  float64
 36  V37     20000 non-null  float64
 37  V38     20000 non-null  float64
 38  V39     20000 non-null  float64
 39  V40     20000 non-null  float64
 40  Target  20000 non-null  int64  
dtypes: float64(40), int64(1)
memory usage: 6.3 MB
In [ ]:
df_test.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5000 entries, 0 to 4999
Data columns (total 41 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   V1      4995 non-null   float64
 1   V2      4994 non-null   float64
 2   V3      5000 non-null   float64
 3   V4      5000 non-null   float64
 4   V5      5000 non-null   float64
 5   V6      5000 non-null   float64
 6   V7      5000 non-null   float64
 7   V8      5000 non-null   float64
 8   V9      5000 non-null   float64
 9   V10     5000 non-null   float64
 10  V11     5000 non-null   float64
 11  V12     5000 non-null   float64
 12  V13     5000 non-null   float64
 13  V14     5000 non-null   float64
 14  V15     5000 non-null   float64
 15  V16     5000 non-null   float64
 16  V17     5000 non-null   float64
 17  V18     5000 non-null   float64
 18  V19     5000 non-null   float64
 19  V20     5000 non-null   float64
 20  V21     5000 non-null   float64
 21  V22     5000 non-null   float64
 22  V23     5000 non-null   float64
 23  V24     5000 non-null   float64
 24  V25     5000 non-null   float64
 25  V26     5000 non-null   float64
 26  V27     5000 non-null   float64
 27  V28     5000 non-null   float64
 28  V29     5000 non-null   float64
 29  V30     5000 non-null   float64
 30  V31     5000 non-null   float64
 31  V32     5000 non-null   float64
 32  V33     5000 non-null   float64
 33  V34     5000 non-null   float64
 34  V35     5000 non-null   float64
 35  V36     5000 non-null   float64
 36  V37     5000 non-null   float64
 37  V38     5000 non-null   float64
 38  V39     5000 non-null   float64
 39  V40     5000 non-null   float64
 40  Target  5000 non-null   int64  
dtypes: float64(40), int64(1)
memory usage: 1.6 MB
  • No anamolies found in train and test data
  • both have 40 rows with non null values and 39 columns with float and target variable as int
In [ ]:
# let's check for missing values in the data
df.isnull().sum() ##  check missing entries in the train data
Out[ ]:
0
V1 18
V2 18
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
V29 0
V30 0
V31 0
V32 0
V33 0
V34 0
V35 0
V36 0
V37 0
V38 0
V39 0
V40 0
Target 0

In [ ]:
df_test.isnull().sum() ##  check missing entries in the test data
Out[ ]:
0
V1 5
V2 6
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
V29 0
V30 0
V31 0
V32 0
V33 0
V34 0
V35 0
V36 0
V37 0
V38 0
V39 0
V40 0
Target 0

In [ ]:
df.isnull().sum() ##  check missing entries in the train data
Out[ ]:
0
V1 18
V2 18
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
V29 0
V30 0
V31 0
V32 0
V33 0
V34 0
V35 0
V36 0
V37 0
V38 0
V39 0
V40 0
Target 0

In [ ]:
df_test.isnull().sum()##  check missing entries in the test data
Out[ ]:
0
V1 5
V2 6
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
V29 0
V30 0
V31 0
V32 0
V33 0
V34 0
V35 0
V36 0
V37 0
V38 0
V39 0
V40 0
Target 0

  • There are missing values for v1 and v2 in train and test set
In [ ]:
df.Target.value_counts(normalize=False)
Out[ ]:
count
Target
0 18890
1 1110

In [ ]:
df_test.Target.value_counts(normalize=False)
Out[ ]:
count
Target
0 4718
1 282

In [ ]:
df.describe().T
Out[ ]:
count mean std min 25% 50% 75% max
V1 19982.0 -0.271996 3.441625 -11.876451 -2.737146 -0.747917 1.840112 15.493002
V2 19982.0 0.440430 3.150784 -12.319951 -1.640674 0.471536 2.543967 13.089269
V3 20000.0 2.484699 3.388963 -10.708139 0.206860 2.255786 4.566165 17.090919
V4 20000.0 -0.083152 3.431595 -15.082052 -2.347660 -0.135241 2.130615 13.236381
V5 20000.0 -0.053752 2.104801 -8.603361 -1.535607 -0.101952 1.340480 8.133797
V6 20000.0 -0.995443 2.040970 -10.227147 -2.347238 -1.000515 0.380330 6.975847
V7 20000.0 -0.879325 1.761626 -7.949681 -2.030926 -0.917179 0.223695 8.006091
V8 20000.0 -0.548195 3.295756 -15.657561 -2.642665 -0.389085 1.722965 11.679495
V9 20000.0 -0.016808 2.160568 -8.596313 -1.494973 -0.067597 1.409203 8.137580
V10 20000.0 -0.012998 2.193201 -9.853957 -1.411212 0.100973 1.477045 8.108472
V11 20000.0 -1.895393 3.124322 -14.832058 -3.922404 -1.921237 0.118906 11.826433
V12 20000.0 1.604825 2.930454 -12.948007 -0.396514 1.507841 3.571454 15.080698
V13 20000.0 1.580486 2.874658 -13.228247 -0.223545 1.637185 3.459886 15.419616
V14 20000.0 -0.950632 1.789651 -7.738593 -2.170741 -0.957163 0.270677 5.670664
V15 20000.0 -2.414993 3.354974 -16.416606 -4.415322 -2.382617 -0.359052 12.246455
V16 20000.0 -2.925225 4.221717 -20.374158 -5.634240 -2.682705 -0.095046 13.583212
V17 20000.0 -0.134261 3.345462 -14.091184 -2.215611 -0.014580 2.068751 16.756432
V18 20000.0 1.189347 2.592276 -11.643994 -0.403917 0.883398 2.571770 13.179863
V19 20000.0 1.181808 3.396925 -13.491784 -1.050168 1.279061 3.493299 13.237742
V20 20000.0 0.023608 3.669477 -13.922659 -2.432953 0.033415 2.512372 16.052339
V21 20000.0 -3.611252 3.567690 -17.956231 -5.930360 -3.532888 -1.265884 13.840473
V22 20000.0 0.951835 1.651547 -10.122095 -0.118127 0.974687 2.025594 7.409856
V23 20000.0 -0.366116 4.031860 -14.866128 -3.098756 -0.262093 2.451750 14.458734
V24 20000.0 1.134389 3.912069 -16.387147 -1.468062 0.969048 3.545975 17.163291
V25 20000.0 -0.002186 2.016740 -8.228266 -1.365178 0.025050 1.397112 8.223389
V26 20000.0 1.873785 3.435137 -11.834271 -0.337863 1.950531 4.130037 16.836410
V27 20000.0 -0.612413 4.368847 -14.904939 -3.652323 -0.884894 2.189177 17.560404
V28 20000.0 -0.883218 1.917713 -9.269489 -2.171218 -0.891073 0.375884 6.527643
V29 20000.0 -0.985625 2.684365 -12.579469 -2.787443 -1.176181 0.629773 10.722055
V30 20000.0 -0.015534 3.005258 -14.796047 -1.867114 0.184346 2.036229 12.505812
V31 20000.0 0.486842 3.461384 -13.722760 -1.817772 0.490304 2.730688 17.255090
V32 20000.0 0.303799 5.500400 -19.876502 -3.420469 0.052073 3.761722 23.633187
V33 20000.0 0.049825 3.575285 -16.898353 -2.242857 -0.066249 2.255134 16.692486
V34 20000.0 -0.462702 3.183841 -17.985094 -2.136984 -0.255008 1.436935 14.358213
V35 20000.0 2.229620 2.937102 -15.349803 0.336191 2.098633 4.064358 15.291065
V36 20000.0 1.514809 3.800860 -14.833178 -0.943809 1.566526 3.983939 19.329576
V37 20000.0 0.011316 1.788165 -5.478350 -1.255819 -0.128435 1.175533 7.467006
V38 20000.0 -0.344025 3.948147 -17.375002 -2.987638 -0.316849 2.279399 15.289923
V39 20000.0 0.890653 1.753054 -6.438880 -0.272250 0.919261 2.057540 7.759877
V40 20000.0 -0.875630 3.012155 -11.023935 -2.940193 -0.920806 1.119897 10.654265
Target 20000.0 0.055500 0.228959 0.000000 0.000000 0.000000 0.000000 1.000000
  • All variables expect for Target have negative values. We can consider them true values as it's information about the sensors.
  • Target column has a mean close to 0, the third quartile is also 0. That means we We can assume that most of the sensors are not failing.
  • V15, V16, and V21 have negative values up to the 3rd quartile.
  • V32 has the highest standard deviation with 5.5.
  • V22 has the lowest standard deviation at 1.652.
  • The mean and median of most variables are close together, we can assume a symmetrical distribution across the variables.

Exploratory Data Analysis¶

In [22]:
def histogram_boxplot(data, feature, figsize=(12, 7), kde=True, bins=None):
    """
    Boxplot and histogram combined

    data: dataframe
    feature: dataframe column
    figsize: size of figure (default (12,7))
    kde: whether to the show density curve (default False)
    bins: number of bins for histogram (default None)
    """
    f2, (ax_box2, ax_hist2) = plt.subplots(
        nrows=2,  # Number of rows of the subplot grid= 2
        sharex=True,  # x-axis will be shared among all subplots
        gridspec_kw={"height_ratios": (0.25, 0.75)},
        figsize=figsize,
    )  # creating the 2 subplots
    sns.boxplot(
        data=data, x=feature, ax=ax_box2, showmeans=True, color="violet"
    )  # boxplot will be created and a star will indicate the mean value of the column
    sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2, bins=bins, palette="winter"
    ) if bins else sns.histplot(
        data=data, x=feature, kde=kde, ax=ax_hist2
    )  # For histogram
    ax_hist2.axvline(
        data[feature].mean(), color="green", linestyle="--"
    )  # Add mean to the histogram
    ax_hist2.axvline(
        data[feature].median(), color="black", linestyle="-"
    )  # Add median to the histogram
In [ ]:
for feature in df.columns:
   histogram_boxplot(df, feature, figsize=(12, 7), kde=False, bins=None)
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
No description has been provided for this image
In [ ]:
#make histplots for each variable
df.hist(bins = 50, figsize = (20, 15));
plt.tight_layout()
No description has been provided for this image
In [ ]:
#make histplots for each variable
df_test.hist(bins = 50, figsize = (20, 15));
plt.tight_layout()
No description has been provided for this image
In [ ]:
#make boxplots for all the variables
df.plot.box(figsize = (20, 15));
No description has been provided for this image
In [ ]:
#make boxplots for all the variables
df_test.plot.box(figsize = (20, 15));
No description has been provided for this image
  • The distribution of all attributes (V1 to V40) is almost symmetrical.
  • As observed previously, there are positive and negative values for all predictor variables.
  • There are outliers on all predictor variables (V1 to V40), we will not treat them as they are ciphered and are considered true values.
  • Most of the sensors detect that the generator is not failing (0). * There are a few failures 1110 out of the 20000.
  • Both train and test set data looks similar
  • Since there are outliers on both the sides decision not to use standard scalar

Bivariate Analysis¶

In [ ]:
cols_list = df.select_dtypes(include=np.number).columns.tolist()
cols_list.remove("Target")

plt.figure(figsize=(20, 20))
sns.heatmap(
    df[cols_list].corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral"
)
plt.show()
No description has been provided for this image
In [ ]:
 
In [ ]:
cols_list = df_test.select_dtypes(include=np.number).columns.tolist()
cols_list.remove("Target")

plt.figure(figsize=(20, 20))
sns.heatmap(
    df[cols_list].corr(), annot=True, vmin=-1, vmax=1, fmt=".2f", cmap="Spectral"
)
plt.show()
No description has been provided for this image
  • Parameters have strong correlation with some eg. v29 having strong correlation with v11 both in test and train set
  • v40 have strong negative correlation with v19

Model Evaluation Criterion¶

Model evaluation criterion¶

The nature of predictions made by the classification model will translate as follows:

  • True positives (TP) are failures correctly predicted by the model.
  • False negatives (FN) are real failures in a generator where there is no detection by model.
  • False positives (FP) are failure detections in a generator where there is no failure.

Which metric to optimize?¶

  • We need to choose the metric which will ensure that the maximum number of generator failures are predicted correctly by the model.
  • False negatives (FN) are real failures where there is no detection by the model. These will result in replacement costs. We will need to minimize it
  • False positives (FP) are detections where there is no failure. These will result in inspection costs.enhance program may even cost more than ReneWind's current situation. Therefore, we our model selection process should optimize for F1 score, which should allow us to select the most profitable model for ReneWind to implement.
In [24]:
#pulling in function to plot the loss/recall
def plot(history, name):
    """
    Function to plot loss/recall

    history: an object which stores the metrics and losses.
    name: can be one of Loss or Recall
    """
    fig, ax = plt.subplots() #Creating a subplot with figure and axes.
    plt.plot(history.history[name]) #Plotting the train recall or train loss
    plt.plot(history.history['val_'+name]) #Plotting the validation recall or validation loss

    plt.title('Model ' + name.capitalize()) #Defining the title of the plot.
    plt.ylabel(name.capitalize()) #Capitalizing the first letter.
    plt.xlabel('Epoch') #Defining the label for the x-axis.
    fig.legend(['Train', 'Validation'], loc="outside right upper") #Defining the legend, loc controls the position of the legend.

Data Preprocessing¶

In [5]:
#copying the data
df_new = df.copy()
df_test_new = df_test.copy()
In [6]:
# Separating target variable and other variables
X_train = df_new.drop(columns="Target")
Y_train = df_new["Target"]
X_test = df_test_new.drop(columns="Target")
y_test = df_test_new["Target"]
In [7]:
#Defining dataframe columns
columns = ["# hidden layers","# neurons - hidden layer","activation function - hidden layer ","# epochs","batch size","optimizer","learning rate, momentum","weight initializer","regularization","train loss","validation loss","train recall","validation recall","time (secs)"]


#Creating a pandas dataframe.
results = pd.DataFrame(columns=columns)
In [8]:
results
Out[8]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
In [10]:
# Splitting data into training, validation and test set:
# first we split data into two parts, temporary and test

# then we split the temporary set into train and validation

X_train, X_val, y_train, y_val = train_test_split(
    X_train, Y_train, test_size=0.30, random_state=1, stratify=Y_train
)
print(X_train.shape, X_val.shape, X_test.shape)
(14000, 40) (6000, 40) (5000, 40)
In [11]:
print("Number of rows in train data =", X_train.shape[0])
print("Number of rows in validation data =", X_val.shape[0])
print("Number of rows in test data =", X_test.shape[0])
Number of rows in train data = 14000
Number of rows in validation data = 6000
Number of rows in test data = 5000
In [12]:
# creating an instace of the imputer to be used
imputer = SimpleImputer(strategy="median")
In [13]:
#replace the missing values in the train, validation, and test sets
X_train = pd.DataFrame(imputer.fit_transform(X_train), columns=X_train.columns)
X_val = pd.DataFrame(imputer.fit_transform(X_val), columns=X_val.columns)
X_test = pd.DataFrame(imputer.fit_transform(X_test), columns=X_test.columns)
In [ ]:
#verify there are no missing values in the training data
X_train.isna().sum()
Out[ ]:
0
V1 0
V2 0
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
V29 0
V30 0
V31 0
V32 0
V33 0
V34 0
V35 0
V36 0
V37 0
V38 0
V39 0
V40 0

In [15]:
#verify there are no missing values in the training data
X_val.isna().sum()
Out[15]:
0
V1 0
V2 0
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
V29 0
V30 0
V31 0
V32 0
V33 0
V34 0
V35 0
V36 0
V37 0
V38 0
V39 0
V40 0

In [ ]:
X_test.isna().sum()
Out[ ]:
0
V1 0
V2 0
V3 0
V4 0
V5 0
V6 0
V7 0
V8 0
V9 0
V10 0
V11 0
V12 0
V13 0
V14 0
V15 0
V16 0
V17 0
V18 0
V19 0
V20 0
V21 0
V22 0
V23 0
V24 0
V25 0
V26 0
V27 0
V28 0
V29 0
V30 0
V31 0
V32 0
V33 0
V34 0
V35 0
V36 0
V37 0
V38 0
V39 0
V40 0

Initial Model Building (Model 0)¶

  • Let's start with a neural network consisting of
    • just one hidden layer
    • activation function of ReLU
    • SGD as the optimizer
In [157]:
def model_performance_classification(
    model, predictors, target, threshold=0.5
):
    """
    Function to compute different metrics to check classification model performance

    model: classifier
    predictors: independent variables
    target: dependent variable
    threshold: threshold for classifying the observation as class 1
    """

    # checking which probabilities are greater than threshold
    pred = model.predict(predictors) > threshold
    # pred_temp = model.predict(predictors) > threshold
    # # rounding off the above values to get classes
    # pred = np.round(pred_temp)

    acc = accuracy_score(target, pred)  # to compute Accuracy
    recall = recall_score(target, pred, average='macro')  # to compute Recall
    precision = precision_score(target, pred, average='macro')  # to compute Precision
    f1 = f1_score(target, pred, average='macro')  # to compute F1-score

    # creating a dataframe of metrics
    df_perf = pd.DataFrame(
        {"Accuracy": acc, "Recall": recall, "Precision": precision, "F1 Score": f1,}, index = [0]
    )

    return df_perf
In [17]:
#Let's start with a neural network consisting of
#just one hidden layer
#activation function of ReLU
#SGD as the optimizer
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model0 = Sequential()

#hidden layer
model0.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model0.add(Dense(1, activation = 'sigmoid'))
In [18]:
#defining optimizer
optimizer = keras.optimizers.SGD()

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model0.compile(optimizer = optimizer, loss = 'binary_crossentropy',metrics = ['recall'])
In [19]:
#looking at model details
model0.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 2,689 (10.50 KB)
 Trainable params: 2,689 (10.50 KB)
 Non-trainable params: 0 (0.00 B)
In [20]:
#defining batch size and epochs
batch_size = 100
epochs = 50

#fitting model
start = time.time()
history = model0.fit(X_train, y_train, validation_data=(X_val,y_val) , batch_size=batch_size, epochs=epochs)
end=time.time()
Epoch 1/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 1s 3ms/step - loss: 0.3548 - recall: 0.4283 - val_loss: 0.1311 - val_recall: 0.4685
Epoch 2/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.1220 - recall: 0.4516 - val_loss: 0.1126 - val_recall: 0.5556
Epoch 3/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.1048 - recall: 0.5218 - val_loss: 0.1040 - val_recall: 0.6186
Epoch 4/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0960 - recall: 0.5773 - val_loss: 0.0988 - val_recall: 0.6547
Epoch 5/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0902 - recall: 0.6200 - val_loss: 0.0950 - val_recall: 0.6817
Epoch 6/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0858 - recall: 0.6645 - val_loss: 0.0920 - val_recall: 0.6937
Epoch 7/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0824 - recall: 0.6847 - val_loss: 0.0896 - val_recall: 0.7027
Epoch 8/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0795 - recall: 0.7006 - val_loss: 0.0875 - val_recall: 0.7117
Epoch 9/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0770 - recall: 0.7192 - val_loss: 0.0857 - val_recall: 0.7237
Epoch 10/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0749 - recall: 0.7296 - val_loss: 0.0840 - val_recall: 0.7417
Epoch 11/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0730 - recall: 0.7445 - val_loss: 0.0826 - val_recall: 0.7508
Epoch 12/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0713 - recall: 0.7568 - val_loss: 0.0813 - val_recall: 0.7538
Epoch 13/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0698 - recall: 0.7577 - val_loss: 0.0801 - val_recall: 0.7598
Epoch 14/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0685 - recall: 0.7648 - val_loss: 0.0790 - val_recall: 0.7748
Epoch 15/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0673 - recall: 0.7687 - val_loss: 0.0780 - val_recall: 0.7808
Epoch 16/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0662 - recall: 0.7762 - val_loss: 0.0771 - val_recall: 0.7808
Epoch 17/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0651 - recall: 0.7798 - val_loss: 0.0764 - val_recall: 0.7808
Epoch 18/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0642 - recall: 0.7850 - val_loss: 0.0757 - val_recall: 0.7868
Epoch 19/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0633 - recall: 0.7850 - val_loss: 0.0750 - val_recall: 0.7868
Epoch 20/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0625 - recall: 0.7854 - val_loss: 0.0744 - val_recall: 0.7868
Epoch 21/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0618 - recall: 0.7863 - val_loss: 0.0739 - val_recall: 0.7868
Epoch 22/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0611 - recall: 0.7884 - val_loss: 0.0733 - val_recall: 0.7898
Epoch 23/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0604 - recall: 0.7887 - val_loss: 0.0729 - val_recall: 0.7898
Epoch 24/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0598 - recall: 0.7916 - val_loss: 0.0724 - val_recall: 0.7898
Epoch 25/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0592 - recall: 0.7959 - val_loss: 0.0720 - val_recall: 0.7958
Epoch 26/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0586 - recall: 0.7941 - val_loss: 0.0717 - val_recall: 0.7958
Epoch 27/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0581 - recall: 0.7941 - val_loss: 0.0713 - val_recall: 0.7958
Epoch 28/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0576 - recall: 0.7934 - val_loss: 0.0710 - val_recall: 0.8018
Epoch 29/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0571 - recall: 0.7985 - val_loss: 0.0707 - val_recall: 0.8018
Epoch 30/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0567 - recall: 0.7988 - val_loss: 0.0704 - val_recall: 0.8018
Epoch 31/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0562 - recall: 0.8032 - val_loss: 0.0701 - val_recall: 0.7988
Epoch 32/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0558 - recall: 0.8099 - val_loss: 0.0698 - val_recall: 0.7988
Epoch 33/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - loss: 0.0554 - recall: 0.8117 - val_loss: 0.0696 - val_recall: 0.7988
Epoch 34/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0551 - recall: 0.8127 - val_loss: 0.0693 - val_recall: 0.7988
Epoch 35/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - loss: 0.0547 - recall: 0.8127 - val_loss: 0.0691 - val_recall: 0.7988
Epoch 36/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0544 - recall: 0.8127 - val_loss: 0.0689 - val_recall: 0.7988
Epoch 37/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0540 - recall: 0.8127 - val_loss: 0.0687 - val_recall: 0.7988
Epoch 38/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0537 - recall: 0.8124 - val_loss: 0.0685 - val_recall: 0.8018
Epoch 39/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0534 - recall: 0.8143 - val_loss: 0.0683 - val_recall: 0.8018
Epoch 40/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0531 - recall: 0.8168 - val_loss: 0.0681 - val_recall: 0.8048
Epoch 41/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0528 - recall: 0.8171 - val_loss: 0.0679 - val_recall: 0.8048
Epoch 42/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0525 - recall: 0.8171 - val_loss: 0.0678 - val_recall: 0.8048
Epoch 43/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0523 - recall: 0.8171 - val_loss: 0.0676 - val_recall: 0.8048
Epoch 44/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0520 - recall: 0.8229 - val_loss: 0.0675 - val_recall: 0.8048
Epoch 45/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0518 - recall: 0.8204 - val_loss: 0.0673 - val_recall: 0.8048
Epoch 46/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0515 - recall: 0.8216 - val_loss: 0.0672 - val_recall: 0.8048
Epoch 47/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0513 - recall: 0.8219 - val_loss: 0.0671 - val_recall: 0.8048
Epoch 48/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0511 - recall: 0.8219 - val_loss: 0.0670 - val_recall: 0.8048
Epoch 49/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0508 - recall: 0.8219 - val_loss: 0.0668 - val_recall: 0.8048
Epoch 50/50
140/140 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - loss: 0.0506 - recall: 0.8219 - val_loss: 0.0667 - val_recall: 0.8048
In [25]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [26]:
#plot model's recall
plot(history,'recall')
No description has been provided for this image
In [211]:
model_0_train_perf = model_performance_classification(model0,X_train, y_train)
model_0_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 673us/step
Out[211]:
Accuracy Recall Precision F1 Score
0 0.989643 0.916383 0.983115 0.946958
In [212]:
model_0_val_perf = model_performance_classification(model0, X_val, y_val)
model_0_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 708us/step
Out[212]:
Accuracy Recall Precision F1 Score
0 0.9875 0.90152 0.976335 0.935333

Model Performance Improvement¶

  • We have steep drops in loss up until around the 3rd epoch
  • Our loss starts to flatten out around 3 epochs, and the validation loss is higher than the training loss after about 2 epochs
  • When it comes to recall, both show steep improvement until around 10 epochs, when it starts to flatten out
  • Validation performance is overall better than training performance until around 14 epochs, which could mean this is where the model starts to overfit
In [28]:
#add model to our results df
results.loc[0] = [
  1, #hidden layers
  64, #neurons/layer
  "relu", #activation function
  epochs, #epochs
  batch_size, #batch size
  "SGD", #optimizer
  [0.001,"-"], # learning rate, momentum
  "-", #weight initializer
  "-", #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[28]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.83269 0.804805 14.1

Model 1¶

Plan:¶

  • Two hidden layers -- 64, 128
  • Activation function -- relu, relu
  • SGD optimizer
In [29]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model1 = Sequential()

#hidden layer
model1.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#hidden layer
model1.add(Dense(128, activation = 'relu'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model1.add(Dense(1, activation = 'sigmoid'))
In [30]:
#looking at model details
model1.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,073 (43.25 KB)
 Trainable params: 11,073 (43.25 KB)
 Non-trainable params: 0 (0.00 B)
In [31]:
#defining optimizer
optimizer = keras.optimizers.SGD()

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model1.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [32]:
#defining batch size and epochs
batch_size = 100
epochs = 50

#fitting model
start = time.time()
history = model1.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, verbose = 2)
end = time.time()
Epoch 1/50
140/140 - 1s - 5ms/step - loss: 0.1678 - recall: 0.2046 - val_loss: 0.1341 - val_recall: 0.3423
Epoch 2/50
140/140 - 0s - 2ms/step - loss: 0.1210 - recall: 0.4118 - val_loss: 0.1147 - val_recall: 0.5105
Epoch 3/50
140/140 - 0s - 2ms/step - loss: 0.1060 - recall: 0.5135 - val_loss: 0.1046 - val_recall: 0.6096
Epoch 4/50
140/140 - 0s - 2ms/step - loss: 0.0969 - recall: 0.5753 - val_loss: 0.0980 - val_recall: 0.6486
Epoch 5/50
140/140 - 0s - 2ms/step - loss: 0.0904 - recall: 0.6100 - val_loss: 0.0930 - val_recall: 0.6847
Epoch 6/50
140/140 - 0s - 2ms/step - loss: 0.0854 - recall: 0.6345 - val_loss: 0.0892 - val_recall: 0.7207
Epoch 7/50
140/140 - 0s - 2ms/step - loss: 0.0814 - recall: 0.6615 - val_loss: 0.0861 - val_recall: 0.7267
Epoch 8/50
140/140 - 0s - 2ms/step - loss: 0.0780 - recall: 0.6808 - val_loss: 0.0835 - val_recall: 0.7447
Epoch 9/50
140/140 - 0s - 2ms/step - loss: 0.0751 - recall: 0.7104 - val_loss: 0.0813 - val_recall: 0.7568
Epoch 10/50
140/140 - 0s - 2ms/step - loss: 0.0725 - recall: 0.7259 - val_loss: 0.0794 - val_recall: 0.7628
Epoch 11/50
140/140 - 0s - 2ms/step - loss: 0.0703 - recall: 0.7375 - val_loss: 0.0778 - val_recall: 0.7658
Epoch 12/50
140/140 - 0s - 2ms/step - loss: 0.0684 - recall: 0.7503 - val_loss: 0.0764 - val_recall: 0.7778
Epoch 13/50
140/140 - 0s - 2ms/step - loss: 0.0666 - recall: 0.7593 - val_loss: 0.0751 - val_recall: 0.7808
Epoch 14/50
140/140 - 0s - 2ms/step - loss: 0.0651 - recall: 0.7645 - val_loss: 0.0739 - val_recall: 0.7898
Epoch 15/50
140/140 - 0s - 2ms/step - loss: 0.0636 - recall: 0.7709 - val_loss: 0.0728 - val_recall: 0.8018
Epoch 16/50
140/140 - 0s - 2ms/step - loss: 0.0623 - recall: 0.7812 - val_loss: 0.0718 - val_recall: 0.8078
Epoch 17/50
140/140 - 0s - 2ms/step - loss: 0.0611 - recall: 0.7851 - val_loss: 0.0709 - val_recall: 0.8168
Epoch 18/50
140/140 - 0s - 2ms/step - loss: 0.0600 - recall: 0.7864 - val_loss: 0.0701 - val_recall: 0.8168
Epoch 19/50
140/140 - 0s - 2ms/step - loss: 0.0590 - recall: 0.7928 - val_loss: 0.0694 - val_recall: 0.8168
Epoch 20/50
140/140 - 0s - 2ms/step - loss: 0.0580 - recall: 0.7941 - val_loss: 0.0687 - val_recall: 0.8168
Epoch 21/50
140/140 - 0s - 2ms/step - loss: 0.0571 - recall: 0.7992 - val_loss: 0.0680 - val_recall: 0.8168
Epoch 22/50
140/140 - 0s - 2ms/step - loss: 0.0563 - recall: 0.8082 - val_loss: 0.0674 - val_recall: 0.8168
Epoch 23/50
140/140 - 0s - 2ms/step - loss: 0.0555 - recall: 0.8108 - val_loss: 0.0668 - val_recall: 0.8168
Epoch 24/50
140/140 - 0s - 2ms/step - loss: 0.0547 - recall: 0.8147 - val_loss: 0.0663 - val_recall: 0.8198
Epoch 25/50
140/140 - 0s - 2ms/step - loss: 0.0540 - recall: 0.8147 - val_loss: 0.0658 - val_recall: 0.8198
Epoch 26/50
140/140 - 0s - 2ms/step - loss: 0.0534 - recall: 0.8160 - val_loss: 0.0654 - val_recall: 0.8228
Epoch 27/50
140/140 - 0s - 2ms/step - loss: 0.0528 - recall: 0.8172 - val_loss: 0.0650 - val_recall: 0.8228
Epoch 28/50
140/140 - 0s - 2ms/step - loss: 0.0522 - recall: 0.8198 - val_loss: 0.0646 - val_recall: 0.8228
Epoch 29/50
140/140 - 0s - 2ms/step - loss: 0.0516 - recall: 0.8211 - val_loss: 0.0642 - val_recall: 0.8228
Epoch 30/50
140/140 - 0s - 2ms/step - loss: 0.0511 - recall: 0.8237 - val_loss: 0.0638 - val_recall: 0.8228
Epoch 31/50
140/140 - 0s - 3ms/step - loss: 0.0506 - recall: 0.8250 - val_loss: 0.0635 - val_recall: 0.8258
Epoch 32/50
140/140 - 0s - 3ms/step - loss: 0.0501 - recall: 0.8275 - val_loss: 0.0632 - val_recall: 0.8288
Epoch 33/50
140/140 - 0s - 2ms/step - loss: 0.0496 - recall: 0.8314 - val_loss: 0.0629 - val_recall: 0.8348
Epoch 34/50
140/140 - 0s - 2ms/step - loss: 0.0492 - recall: 0.8327 - val_loss: 0.0626 - val_recall: 0.8348
Epoch 35/50
140/140 - 0s - 3ms/step - loss: 0.0487 - recall: 0.8327 - val_loss: 0.0624 - val_recall: 0.8348
Epoch 36/50
140/140 - 0s - 2ms/step - loss: 0.0483 - recall: 0.8340 - val_loss: 0.0622 - val_recall: 0.8378
Epoch 37/50
140/140 - 0s - 2ms/step - loss: 0.0479 - recall: 0.8353 - val_loss: 0.0620 - val_recall: 0.8378
Epoch 38/50
140/140 - 0s - 2ms/step - loss: 0.0475 - recall: 0.8353 - val_loss: 0.0617 - val_recall: 0.8408
Epoch 39/50
140/140 - 0s - 2ms/step - loss: 0.0471 - recall: 0.8366 - val_loss: 0.0616 - val_recall: 0.8408
Epoch 40/50
140/140 - 0s - 2ms/step - loss: 0.0468 - recall: 0.8417 - val_loss: 0.0614 - val_recall: 0.8408
Epoch 41/50
140/140 - 0s - 2ms/step - loss: 0.0464 - recall: 0.8430 - val_loss: 0.0612 - val_recall: 0.8408
Epoch 42/50
140/140 - 0s - 2ms/step - loss: 0.0461 - recall: 0.8443 - val_loss: 0.0611 - val_recall: 0.8408
Epoch 43/50
140/140 - 0s - 2ms/step - loss: 0.0458 - recall: 0.8443 - val_loss: 0.0609 - val_recall: 0.8408
Epoch 44/50
140/140 - 0s - 2ms/step - loss: 0.0455 - recall: 0.8456 - val_loss: 0.0608 - val_recall: 0.8408
Epoch 45/50
140/140 - 0s - 2ms/step - loss: 0.0452 - recall: 0.8468 - val_loss: 0.0607 - val_recall: 0.8408
Epoch 46/50
140/140 - 0s - 2ms/step - loss: 0.0449 - recall: 0.8507 - val_loss: 0.0605 - val_recall: 0.8408
Epoch 47/50
140/140 - 0s - 2ms/step - loss: 0.0446 - recall: 0.8520 - val_loss: 0.0604 - val_recall: 0.8408
Epoch 48/50
140/140 - 0s - 2ms/step - loss: 0.0443 - recall: 0.8533 - val_loss: 0.0603 - val_recall: 0.8408
Epoch 49/50
140/140 - 0s - 2ms/step - loss: 0.0440 - recall: 0.8546 - val_loss: 0.0602 - val_recall: 0.8408
Epoch 50/50
140/140 - 0s - 2ms/step - loss: 0.0438 - recall: 0.8584 - val_loss: 0.0601 - val_recall: 0.8408
In [33]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [34]:
#plot model's recall
plot(history,'recall')
No description has been provided for this image
  • We have steep drops in loss until about the 5th epoch before it begins slowing down and plateauing
  • Validation loss is higher than training loss after about the 4th epoch
  • For recall, both show steep improvement until about the 8th epoch, before it starts to slow down and plateau
  • Validation performance is overall better than training performance until about the 21st epoch, when both are very similar
  • Training performance is better after the 40th epoch
In [218]:
model_1_train_perf = model_performance_classification(model1, X_train, y_train)
model_1_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 709us/step
Out[218]:
Accuracy Recall Precision F1 Score
0 0.991357 0.929405 0.987085 0.956197
In [219]:
model_1_val_perf = model_performance_classification(model1, X_val, y_val)
model_1_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 742us/step
Out[219]:
Accuracy Recall Precision F1 Score
0 0.99 0.919803 0.983166 0.948977

Model 2¶

Plan:¶

  • Two hidden layers -- 64, 64
  • activation function -- relu, tanh
  • SGD optimizer
In [37]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model2 = Sequential()

#hidden layer
model2.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#hidden layer
model2.add(Dense(64, activation = 'tanh'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model2.add(Dense(1, activation = 'sigmoid'))
In [38]:
model2.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 64)             │         4,160 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │            65 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 6,849 (26.75 KB)
 Trainable params: 6,849 (26.75 KB)
 Non-trainable params: 0 (0.00 B)
In [39]:
#defining optimizer
optimizer = keras.optimizers.SGD()

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model2.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [40]:
#defining batch size and epochs
batch_size = 100
epochs = 50

#fitting model
start = time.time()
history = model2.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, verbose = 2)
end=time.time()
Epoch 1/50
140/140 - 1s - 5ms/step - loss: 0.1799 - recall: 0.1802 - val_loss: 0.1355 - val_recall: 0.2252
Epoch 2/50
140/140 - 0s - 2ms/step - loss: 0.1246 - recall: 0.2896 - val_loss: 0.1182 - val_recall: 0.3694
Epoch 3/50
140/140 - 0s - 2ms/step - loss: 0.1097 - recall: 0.4003 - val_loss: 0.1080 - val_recall: 0.4505
Epoch 4/50
140/140 - 0s - 2ms/step - loss: 0.0999 - recall: 0.4762 - val_loss: 0.1009 - val_recall: 0.5045
Epoch 5/50
140/140 - 0s - 2ms/step - loss: 0.0928 - recall: 0.5251 - val_loss: 0.0956 - val_recall: 0.5465
Epoch 6/50
140/140 - 0s - 2ms/step - loss: 0.0873 - recall: 0.5663 - val_loss: 0.0914 - val_recall: 0.5796
Epoch 7/50
140/140 - 0s - 3ms/step - loss: 0.0828 - recall: 0.6010 - val_loss: 0.0879 - val_recall: 0.6126
Epoch 8/50
140/140 - 1s - 5ms/step - loss: 0.0790 - recall: 0.6281 - val_loss: 0.0850 - val_recall: 0.6396
Epoch 9/50
140/140 - 0s - 3ms/step - loss: 0.0758 - recall: 0.6499 - val_loss: 0.0825 - val_recall: 0.6607
Epoch 10/50
140/140 - 0s - 3ms/step - loss: 0.0730 - recall: 0.6680 - val_loss: 0.0802 - val_recall: 0.6817
Epoch 11/50
140/140 - 0s - 2ms/step - loss: 0.0706 - recall: 0.6911 - val_loss: 0.0783 - val_recall: 0.6937
Epoch 12/50
140/140 - 0s - 2ms/step - loss: 0.0684 - recall: 0.7066 - val_loss: 0.0765 - val_recall: 0.7087
Epoch 13/50
140/140 - 0s - 2ms/step - loss: 0.0665 - recall: 0.7259 - val_loss: 0.0749 - val_recall: 0.7237
Epoch 14/50
140/140 - 0s - 2ms/step - loss: 0.0648 - recall: 0.7349 - val_loss: 0.0735 - val_recall: 0.7267
Epoch 15/50
140/140 - 0s - 2ms/step - loss: 0.0632 - recall: 0.7413 - val_loss: 0.0722 - val_recall: 0.7327
Epoch 16/50
140/140 - 0s - 2ms/step - loss: 0.0618 - recall: 0.7568 - val_loss: 0.0710 - val_recall: 0.7327
Epoch 17/50
140/140 - 0s - 2ms/step - loss: 0.0605 - recall: 0.7658 - val_loss: 0.0699 - val_recall: 0.7538
Epoch 18/50
140/140 - 0s - 2ms/step - loss: 0.0594 - recall: 0.7709 - val_loss: 0.0689 - val_recall: 0.7568
Epoch 19/50
140/140 - 0s - 2ms/step - loss: 0.0583 - recall: 0.7761 - val_loss: 0.0680 - val_recall: 0.7598
Epoch 20/50
140/140 - 0s - 2ms/step - loss: 0.0573 - recall: 0.7838 - val_loss: 0.0671 - val_recall: 0.7688
Epoch 21/50
140/140 - 0s - 2ms/step - loss: 0.0564 - recall: 0.7876 - val_loss: 0.0663 - val_recall: 0.7808
Epoch 22/50
140/140 - 0s - 2ms/step - loss: 0.0556 - recall: 0.7928 - val_loss: 0.0656 - val_recall: 0.7898
Epoch 23/50
140/140 - 0s - 2ms/step - loss: 0.0548 - recall: 0.7954 - val_loss: 0.0649 - val_recall: 0.7928
Epoch 24/50
140/140 - 0s - 2ms/step - loss: 0.0540 - recall: 0.8005 - val_loss: 0.0642 - val_recall: 0.8018
Epoch 25/50
140/140 - 0s - 2ms/step - loss: 0.0533 - recall: 0.8018 - val_loss: 0.0636 - val_recall: 0.8048
Epoch 26/50
140/140 - 0s - 2ms/step - loss: 0.0526 - recall: 0.8031 - val_loss: 0.0630 - val_recall: 0.8048
Epoch 27/50
140/140 - 0s - 2ms/step - loss: 0.0520 - recall: 0.8044 - val_loss: 0.0624 - val_recall: 0.8108
Epoch 28/50
140/140 - 0s - 2ms/step - loss: 0.0514 - recall: 0.8057 - val_loss: 0.0619 - val_recall: 0.8108
Epoch 29/50
140/140 - 0s - 2ms/step - loss: 0.0508 - recall: 0.8121 - val_loss: 0.0614 - val_recall: 0.8108
Epoch 30/50
140/140 - 0s - 2ms/step - loss: 0.0503 - recall: 0.8185 - val_loss: 0.0609 - val_recall: 0.8108
Epoch 31/50
140/140 - 0s - 2ms/step - loss: 0.0498 - recall: 0.8224 - val_loss: 0.0605 - val_recall: 0.8108
Epoch 32/50
140/140 - 0s - 2ms/step - loss: 0.0493 - recall: 0.8250 - val_loss: 0.0601 - val_recall: 0.8108
Epoch 33/50
140/140 - 0s - 2ms/step - loss: 0.0488 - recall: 0.8288 - val_loss: 0.0596 - val_recall: 0.8138
Epoch 34/50
140/140 - 0s - 2ms/step - loss: 0.0484 - recall: 0.8288 - val_loss: 0.0593 - val_recall: 0.8168
Epoch 35/50
140/140 - 0s - 2ms/step - loss: 0.0479 - recall: 0.8288 - val_loss: 0.0589 - val_recall: 0.8168
Epoch 36/50
140/140 - 0s - 2ms/step - loss: 0.0475 - recall: 0.8288 - val_loss: 0.0585 - val_recall: 0.8198
Epoch 37/50
140/140 - 0s - 2ms/step - loss: 0.0471 - recall: 0.8327 - val_loss: 0.0582 - val_recall: 0.8198
Epoch 38/50
140/140 - 0s - 2ms/step - loss: 0.0467 - recall: 0.8366 - val_loss: 0.0578 - val_recall: 0.8198
Epoch 39/50
140/140 - 0s - 2ms/step - loss: 0.0463 - recall: 0.8366 - val_loss: 0.0575 - val_recall: 0.8198
Epoch 40/50
140/140 - 0s - 2ms/step - loss: 0.0460 - recall: 0.8378 - val_loss: 0.0572 - val_recall: 0.8228
Epoch 41/50
140/140 - 0s - 2ms/step - loss: 0.0456 - recall: 0.8391 - val_loss: 0.0569 - val_recall: 0.8228
Epoch 42/50
140/140 - 0s - 2ms/step - loss: 0.0453 - recall: 0.8391 - val_loss: 0.0566 - val_recall: 0.8228
Epoch 43/50
140/140 - 0s - 2ms/step - loss: 0.0449 - recall: 0.8404 - val_loss: 0.0563 - val_recall: 0.8228
Epoch 44/50
140/140 - 0s - 2ms/step - loss: 0.0446 - recall: 0.8417 - val_loss: 0.0560 - val_recall: 0.8228
Epoch 45/50
140/140 - 0s - 2ms/step - loss: 0.0443 - recall: 0.8430 - val_loss: 0.0558 - val_recall: 0.8228
Epoch 46/50
140/140 - 0s - 2ms/step - loss: 0.0440 - recall: 0.8456 - val_loss: 0.0555 - val_recall: 0.8228
Epoch 47/50
140/140 - 0s - 2ms/step - loss: 0.0437 - recall: 0.8468 - val_loss: 0.0553 - val_recall: 0.8258
Epoch 48/50
140/140 - 0s - 2ms/step - loss: 0.0434 - recall: 0.8481 - val_loss: 0.0551 - val_recall: 0.8258
Epoch 49/50
140/140 - 0s - 2ms/step - loss: 0.0431 - recall: 0.8481 - val_loss: 0.0548 - val_recall: 0.8288
Epoch 50/50
140/140 - 0s - 2ms/step - loss: 0.0429 - recall: 0.8481 - val_loss: 0.0546 - val_recall: 0.8288
In [41]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [42]:
#plot model's recall
plot(history,'recall')
No description has been provided for this image
  • We have steep drops in loss until around the 5th epoch, then the decreases are less steep
  • Validation loss is higher than training loss after around 5 epochs
  • For recall, we have steep increases until around 10 epochs, when it starts slowing and plateauing
  • Validation performance is similar to the training performanceuntil around 30 epochs, when training recall becomes higher
  • That said, there is a slim gap
In [172]:
model_2_train_perf = model_performance_classification(model2, X_train, y_train)
model_2_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 683us/step
Out[172]:
Accuracy Recall Precision F1 Score
0 0.991214 0.926301 0.989027 0.955241
In [173]:
model_2_val_perf = model_performance_classification(model2, X_val, y_val)
model_2_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 703us/step
Out[173]:
Accuracy Recall Precision F1 Score
0 0.989667 0.913973 0.98612 0.946789
In [44]:
#add model to our results df
results.loc[2] = [
  2, #hidden layers
  [64, 64], #neurons/layer
  ["relu", "tanh"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "SGD", #optimizer
  [0.001,"-"], # learning rate, momentum
  ["Xav", "Xav", "Xav"], #weight initializer
  "-", #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[44]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53

Model 3¶

Plan¶

  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • SGD optimizer with momentum
  • Reduced learning rate of 1e-4, and increase epochs to 100
In [45]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model3 = Sequential()

#hidden layer
model3.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#hidden layer
model3.add(Dense(128, activation = 'relu'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model3.add(Dense(1, activation = 'sigmoid'))
In [46]:
model3.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,073 (43.25 KB)
 Trainable params: 11,073 (43.25 KB)
 Non-trainable params: 0 (0.00 B)
In [47]:
#defining optimizer
mom = 0.9
optimizer = keras.optimizers.SGD(momentum = mom)

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model3.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [48]:
#defining batch size and epochs
batch_size = 100
epochs = 50

#fitting model
start = time.time()
history = model3.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, verbose = 2)
end=time.time()
Epoch 1/50
140/140 - 1s - 6ms/step - loss: 0.1490 - recall: 0.5290 - val_loss: 0.0856 - val_recall: 0.7267
Epoch 2/50
140/140 - 0s - 2ms/step - loss: 0.0703 - recall: 0.7683 - val_loss: 0.0718 - val_recall: 0.7928
Epoch 3/50
140/140 - 0s - 2ms/step - loss: 0.0592 - recall: 0.8069 - val_loss: 0.0656 - val_recall: 0.8078
Epoch 4/50
140/140 - 0s - 2ms/step - loss: 0.0533 - recall: 0.8301 - val_loss: 0.0623 - val_recall: 0.8258
Epoch 5/50
140/140 - 0s - 2ms/step - loss: 0.0494 - recall: 0.8456 - val_loss: 0.0601 - val_recall: 0.8348
Epoch 6/50
140/140 - 0s - 2ms/step - loss: 0.0466 - recall: 0.8559 - val_loss: 0.0585 - val_recall: 0.8378
Epoch 7/50
140/140 - 0s - 2ms/step - loss: 0.0444 - recall: 0.8649 - val_loss: 0.0573 - val_recall: 0.8378
Epoch 8/50
140/140 - 0s - 2ms/step - loss: 0.0425 - recall: 0.8687 - val_loss: 0.0564 - val_recall: 0.8378
Epoch 9/50
140/140 - 0s - 2ms/step - loss: 0.0409 - recall: 0.8777 - val_loss: 0.0556 - val_recall: 0.8438
Epoch 10/50
140/140 - 0s - 2ms/step - loss: 0.0396 - recall: 0.8855 - val_loss: 0.0550 - val_recall: 0.8468
Epoch 11/50
140/140 - 0s - 2ms/step - loss: 0.0384 - recall: 0.8855 - val_loss: 0.0545 - val_recall: 0.8468
Epoch 12/50
140/140 - 0s - 3ms/step - loss: 0.0374 - recall: 0.8919 - val_loss: 0.0542 - val_recall: 0.8468
Epoch 13/50
140/140 - 0s - 3ms/step - loss: 0.0364 - recall: 0.8945 - val_loss: 0.0539 - val_recall: 0.8468
Epoch 14/50
140/140 - 0s - 3ms/step - loss: 0.0356 - recall: 0.8945 - val_loss: 0.0538 - val_recall: 0.8529
Epoch 15/50
140/140 - 0s - 3ms/step - loss: 0.0348 - recall: 0.8958 - val_loss: 0.0536 - val_recall: 0.8559
Epoch 16/50
140/140 - 0s - 3ms/step - loss: 0.0340 - recall: 0.8970 - val_loss: 0.0535 - val_recall: 0.8559
Epoch 17/50
140/140 - 0s - 2ms/step - loss: 0.0333 - recall: 0.8996 - val_loss: 0.0534 - val_recall: 0.8559
Epoch 18/50
140/140 - 0s - 2ms/step - loss: 0.0327 - recall: 0.9009 - val_loss: 0.0535 - val_recall: 0.8529
Epoch 19/50
140/140 - 0s - 2ms/step - loss: 0.0321 - recall: 0.9009 - val_loss: 0.0535 - val_recall: 0.8589
Epoch 20/50
140/140 - 0s - 2ms/step - loss: 0.0316 - recall: 0.8996 - val_loss: 0.0534 - val_recall: 0.8559
Epoch 21/50
140/140 - 0s - 2ms/step - loss: 0.0310 - recall: 0.9022 - val_loss: 0.0533 - val_recall: 0.8589
Epoch 22/50
140/140 - 0s - 2ms/step - loss: 0.0305 - recall: 0.9035 - val_loss: 0.0534 - val_recall: 0.8619
Epoch 23/50
140/140 - 0s - 2ms/step - loss: 0.0300 - recall: 0.9035 - val_loss: 0.0532 - val_recall: 0.8589
Epoch 24/50
140/140 - 0s - 2ms/step - loss: 0.0295 - recall: 0.9035 - val_loss: 0.0533 - val_recall: 0.8589
Epoch 25/50
140/140 - 0s - 2ms/step - loss: 0.0291 - recall: 0.9035 - val_loss: 0.0533 - val_recall: 0.8589
Epoch 26/50
140/140 - 0s - 2ms/step - loss: 0.0285 - recall: 0.9035 - val_loss: 0.0534 - val_recall: 0.8619
Epoch 27/50
140/140 - 0s - 2ms/step - loss: 0.0281 - recall: 0.9035 - val_loss: 0.0534 - val_recall: 0.8589
Epoch 28/50
140/140 - 0s - 2ms/step - loss: 0.0276 - recall: 0.9073 - val_loss: 0.0535 - val_recall: 0.8589
Epoch 29/50
140/140 - 0s - 2ms/step - loss: 0.0272 - recall: 0.9060 - val_loss: 0.0534 - val_recall: 0.8619
Epoch 30/50
140/140 - 0s - 2ms/step - loss: 0.0268 - recall: 0.9099 - val_loss: 0.0536 - val_recall: 0.8709
Epoch 31/50
140/140 - 0s - 2ms/step - loss: 0.0264 - recall: 0.9099 - val_loss: 0.0533 - val_recall: 0.8709
Epoch 32/50
140/140 - 0s - 2ms/step - loss: 0.0260 - recall: 0.9086 - val_loss: 0.0538 - val_recall: 0.8709
Epoch 33/50
140/140 - 0s - 2ms/step - loss: 0.0257 - recall: 0.9099 - val_loss: 0.0538 - val_recall: 0.8739
Epoch 34/50
140/140 - 0s - 2ms/step - loss: 0.0253 - recall: 0.9099 - val_loss: 0.0541 - val_recall: 0.8769
Epoch 35/50
140/140 - 0s - 2ms/step - loss: 0.0249 - recall: 0.9112 - val_loss: 0.0542 - val_recall: 0.8769
Epoch 36/50
140/140 - 0s - 2ms/step - loss: 0.0246 - recall: 0.9138 - val_loss: 0.0549 - val_recall: 0.8769
Epoch 37/50
140/140 - 0s - 2ms/step - loss: 0.0243 - recall: 0.9151 - val_loss: 0.0549 - val_recall: 0.8769
Epoch 38/50
140/140 - 0s - 2ms/step - loss: 0.0239 - recall: 0.9151 - val_loss: 0.0548 - val_recall: 0.8769
Epoch 39/50
140/140 - 0s - 2ms/step - loss: 0.0235 - recall: 0.9163 - val_loss: 0.0551 - val_recall: 0.8739
Epoch 40/50
140/140 - 0s - 2ms/step - loss: 0.0232 - recall: 0.9163 - val_loss: 0.0555 - val_recall: 0.8739
Epoch 41/50
140/140 - 0s - 2ms/step - loss: 0.0230 - recall: 0.9189 - val_loss: 0.0556 - val_recall: 0.8739
Epoch 42/50
140/140 - 0s - 2ms/step - loss: 0.0226 - recall: 0.9202 - val_loss: 0.0556 - val_recall: 0.8769
Epoch 43/50
140/140 - 0s - 2ms/step - loss: 0.0223 - recall: 0.9202 - val_loss: 0.0560 - val_recall: 0.8739
Epoch 44/50
140/140 - 0s - 2ms/step - loss: 0.0220 - recall: 0.9228 - val_loss: 0.0564 - val_recall: 0.8739
Epoch 45/50
140/140 - 0s - 2ms/step - loss: 0.0216 - recall: 0.9241 - val_loss: 0.0567 - val_recall: 0.8739
Epoch 46/50
140/140 - 0s - 2ms/step - loss: 0.0213 - recall: 0.9241 - val_loss: 0.0569 - val_recall: 0.8769
Epoch 47/50
140/140 - 1s - 4ms/step - loss: 0.0210 - recall: 0.9241 - val_loss: 0.0569 - val_recall: 0.8769
Epoch 48/50
140/140 - 0s - 2ms/step - loss: 0.0207 - recall: 0.9254 - val_loss: 0.0576 - val_recall: 0.8739
Epoch 49/50
140/140 - 1s - 4ms/step - loss: 0.0204 - recall: 0.9241 - val_loss: 0.0578 - val_recall: 0.8679
Epoch 50/50
140/140 - 0s - 3ms/step - loss: 0.0200 - recall: 0.9254 - val_loss: 0.0583 - val_recall: 0.8709
In [49]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [50]:
#plot model's recall
plot(history,'recall')
No description has been provided for this image
  • We see steep drops in loss until about the 5th or 6th epoch, then the decreases are less steep
  • Meantime, training loss continues to drop, though less quickly -- meaning we could be overfitting after this point
  • Regarding recall, both training and validation show dramatic increases until about the 4th epoch
  • After that, validation starts to plateau and oscillate, though it generally increases until about the 30th epoch, before generally levelling off
  • Training continues to improve, though at a slower rate
In [215]:
model_3_train_perf = model_performance_classification(model3, X_train, y_train)
model_3_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 737us/step
Out[215]:
Accuracy Recall Precision F1 Score
0 0.995643 0.961958 0.99639 0.978475
In [216]:
model_3_val_perf = model_performance_classification(model3, X_val, y_val)
model_3_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 949us/step
Out[216]:
Accuracy Recall Precision F1 Score
0 0.990833 0.934377 0.976359 0.954273
In [52]:
#add model to our results df
results.loc[3] = [
  2, #hidden layers
  [64,128],#neurons layer
  ["relu","relu"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "SGD with mom", #optimizer
  [0.001,0.9], # learning rate, momentum
  ["Xav", "Xav", "Xav"], #weight initializer
  "-", #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[52]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
# This is formatted as code

Model 4¶

Plan:¶

  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • SGD optimizer with momentum
  • Reduced learning rate of 1e-4, and increase epochs to 100
In [53]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model4 = Sequential()

#hidden layer
model4.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#hidden layer
model4.add(Dense(128, activation = 'relu'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model4.add(Dense(1, activation = 'sigmoid'))
In [54]:
#looking at model details
model4.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,073 (43.25 KB)
 Trainable params: 11,073 (43.25 KB)
 Non-trainable params: 0 (0.00 B)
In [55]:
#defining optimizer
lr = 1e-4
mom = 0.9
optimizer = keras.optimizers.SGD(momentum = mom, learning_rate = lr)

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model4.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [56]:
#defining batch size and epochs
batch_size = 100
epochs = 50

#fitting model
start = time.time()
history = model4.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, verbose = 2)
end = time.time()
Epoch 1/50
140/140 - 1s - 6ms/step - loss: 0.3403 - recall: 0.1879 - val_loss: 0.1945 - val_recall: 0.0270
Epoch 2/50
140/140 - 0s - 2ms/step - loss: 0.1839 - recall: 0.0322 - val_loss: 0.1769 - val_recall: 0.0330
Epoch 3/50
140/140 - 0s - 2ms/step - loss: 0.1713 - recall: 0.0528 - val_loss: 0.1674 - val_recall: 0.0631
Epoch 4/50
140/140 - 0s - 2ms/step - loss: 0.1627 - recall: 0.0798 - val_loss: 0.1601 - val_recall: 0.1141
Epoch 5/50
140/140 - 0s - 2ms/step - loss: 0.1559 - recall: 0.1171 - val_loss: 0.1542 - val_recall: 0.1441
Epoch 6/50
140/140 - 0s - 2ms/step - loss: 0.1502 - recall: 0.1660 - val_loss: 0.1493 - val_recall: 0.1712
Epoch 7/50
140/140 - 0s - 2ms/step - loss: 0.1453 - recall: 0.1995 - val_loss: 0.1450 - val_recall: 0.2012
Epoch 8/50
140/140 - 0s - 2ms/step - loss: 0.1411 - recall: 0.2317 - val_loss: 0.1413 - val_recall: 0.2312
Epoch 9/50
140/140 - 0s - 2ms/step - loss: 0.1374 - recall: 0.2728 - val_loss: 0.1379 - val_recall: 0.2583
Epoch 10/50
140/140 - 0s - 2ms/step - loss: 0.1340 - recall: 0.2934 - val_loss: 0.1349 - val_recall: 0.2913
Epoch 11/50
140/140 - 0s - 2ms/step - loss: 0.1310 - recall: 0.3166 - val_loss: 0.1322 - val_recall: 0.3363
Epoch 12/50
140/140 - 0s - 2ms/step - loss: 0.1282 - recall: 0.3423 - val_loss: 0.1297 - val_recall: 0.3544
Epoch 13/50
140/140 - 0s - 2ms/step - loss: 0.1256 - recall: 0.3719 - val_loss: 0.1273 - val_recall: 0.3604
Epoch 14/50
140/140 - 0s - 2ms/step - loss: 0.1232 - recall: 0.3848 - val_loss: 0.1252 - val_recall: 0.3994
Epoch 15/50
140/140 - 0s - 2ms/step - loss: 0.1210 - recall: 0.4041 - val_loss: 0.1232 - val_recall: 0.4174
Epoch 16/50
140/140 - 0s - 2ms/step - loss: 0.1190 - recall: 0.4170 - val_loss: 0.1213 - val_recall: 0.4354
Epoch 17/50
140/140 - 0s - 2ms/step - loss: 0.1170 - recall: 0.4260 - val_loss: 0.1196 - val_recall: 0.4535
Epoch 18/50
140/140 - 0s - 2ms/step - loss: 0.1152 - recall: 0.4414 - val_loss: 0.1180 - val_recall: 0.4625
Epoch 19/50
140/140 - 0s - 2ms/step - loss: 0.1135 - recall: 0.4633 - val_loss: 0.1164 - val_recall: 0.4775
Epoch 20/50
140/140 - 0s - 2ms/step - loss: 0.1119 - recall: 0.4749 - val_loss: 0.1150 - val_recall: 0.4925
Epoch 21/50
140/140 - 0s - 2ms/step - loss: 0.1104 - recall: 0.4891 - val_loss: 0.1136 - val_recall: 0.5045
Epoch 22/50
140/140 - 0s - 2ms/step - loss: 0.1090 - recall: 0.5084 - val_loss: 0.1124 - val_recall: 0.5105
Epoch 23/50
140/140 - 0s - 2ms/step - loss: 0.1076 - recall: 0.5109 - val_loss: 0.1111 - val_recall: 0.5315
Epoch 24/50
140/140 - 0s - 2ms/step - loss: 0.1063 - recall: 0.5212 - val_loss: 0.1100 - val_recall: 0.5405
Epoch 25/50
140/140 - 0s - 2ms/step - loss: 0.1051 - recall: 0.5315 - val_loss: 0.1089 - val_recall: 0.5556
Epoch 26/50
140/140 - 1s - 4ms/step - loss: 0.1039 - recall: 0.5405 - val_loss: 0.1078 - val_recall: 0.5646
Epoch 27/50
140/140 - 1s - 5ms/step - loss: 0.1028 - recall: 0.5495 - val_loss: 0.1068 - val_recall: 0.5676
Epoch 28/50
140/140 - 0s - 2ms/step - loss: 0.1017 - recall: 0.5521 - val_loss: 0.1059 - val_recall: 0.5736
Epoch 29/50
140/140 - 0s - 2ms/step - loss: 0.1007 - recall: 0.5586 - val_loss: 0.1050 - val_recall: 0.5856
Epoch 30/50
140/140 - 0s - 2ms/step - loss: 0.0997 - recall: 0.5611 - val_loss: 0.1041 - val_recall: 0.5886
Epoch 31/50
140/140 - 0s - 2ms/step - loss: 0.0988 - recall: 0.5663 - val_loss: 0.1033 - val_recall: 0.5946
Epoch 32/50
140/140 - 0s - 2ms/step - loss: 0.0978 - recall: 0.5753 - val_loss: 0.1025 - val_recall: 0.6066
Epoch 33/50
140/140 - 0s - 2ms/step - loss: 0.0970 - recall: 0.5792 - val_loss: 0.1018 - val_recall: 0.6066
Epoch 34/50
140/140 - 0s - 2ms/step - loss: 0.0961 - recall: 0.5817 - val_loss: 0.1011 - val_recall: 0.6126
Epoch 35/50
140/140 - 0s - 2ms/step - loss: 0.0953 - recall: 0.5817 - val_loss: 0.1004 - val_recall: 0.6156
Epoch 36/50
140/140 - 0s - 2ms/step - loss: 0.0946 - recall: 0.5869 - val_loss: 0.0997 - val_recall: 0.6186
Epoch 37/50
140/140 - 0s - 2ms/step - loss: 0.0938 - recall: 0.5894 - val_loss: 0.0990 - val_recall: 0.6246
Epoch 38/50
140/140 - 0s - 2ms/step - loss: 0.0931 - recall: 0.5972 - val_loss: 0.0984 - val_recall: 0.6246
Epoch 39/50
140/140 - 0s - 2ms/step - loss: 0.0924 - recall: 0.6049 - val_loss: 0.0978 - val_recall: 0.6276
Epoch 40/50
140/140 - 0s - 2ms/step - loss: 0.0917 - recall: 0.6075 - val_loss: 0.0972 - val_recall: 0.6306
Epoch 41/50
140/140 - 0s - 2ms/step - loss: 0.0910 - recall: 0.6126 - val_loss: 0.0967 - val_recall: 0.6336
Epoch 42/50
140/140 - 0s - 2ms/step - loss: 0.0904 - recall: 0.6126 - val_loss: 0.0961 - val_recall: 0.6396
Epoch 43/50
140/140 - 0s - 2ms/step - loss: 0.0898 - recall: 0.6216 - val_loss: 0.0956 - val_recall: 0.6486
Epoch 44/50
140/140 - 0s - 2ms/step - loss: 0.0892 - recall: 0.6229 - val_loss: 0.0951 - val_recall: 0.6547
Epoch 45/50
140/140 - 0s - 2ms/step - loss: 0.0886 - recall: 0.6242 - val_loss: 0.0946 - val_recall: 0.6637
Epoch 46/50
140/140 - 0s - 2ms/step - loss: 0.0880 - recall: 0.6306 - val_loss: 0.0941 - val_recall: 0.6667
Epoch 47/50
140/140 - 0s - 2ms/step - loss: 0.0875 - recall: 0.6345 - val_loss: 0.0936 - val_recall: 0.6667
Epoch 48/50
140/140 - 0s - 2ms/step - loss: 0.0869 - recall: 0.6371 - val_loss: 0.0932 - val_recall: 0.6667
Epoch 49/50
140/140 - 0s - 2ms/step - loss: 0.0864 - recall: 0.6435 - val_loss: 0.0927 - val_recall: 0.6697
Epoch 50/50
140/140 - 0s - 2ms/step - loss: 0.0859 - recall: 0.6461 - val_loss: 0.0923 - val_recall: 0.6727
In [57]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [59]:
#plot model's recall
plot(history,'recall')
No description has been provided for this image

Observations:

  • (Write down your observations here)
In [178]:
model_4_train_perf = model_performance_classification(model4, X_train, y_train)
model_4_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 687us/step
Out[178]:
Accuracy Recall Precision F1 Score
0 0.978571 0.822092 0.966075 0.879351
In [180]:
model_4_val_perf=model_performance_classification(model4, X_val, y_val)
model_4_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[180]:
Accuracy Recall Precision F1 Score
0 0.978833 0.834748 0.953345 0.884007
In [61]:
#add model to our results df
results.loc[4] = [
  2, #hidden layers
  [64, 128], #neurons/layer
  ["relu", "relu"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "SGD with mom", #optimizer
  [0.0001, 0.9], # learning rate, momentum
  ["Xav", "Xav", "Xav"], #weight initializer
  "-", #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[61]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
4 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.0001, 0.9] [Xav, Xav, Xav] - 0.085880 0.092282 0.646075 0.672673 15.04

Model 5¶

Plan:

  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • Adam optimizer
In [62]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model5 = Sequential()

#hidden layer
model5.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#hidden layer
model5.add(Dense(128, activation = 'relu'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model5.add(Dense(1, activation = 'sigmoid'))
In [63]:
model5.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,073 (43.25 KB)
 Trainable params: 11,073 (43.25 KB)
 Non-trainable params: 0 (0.00 B)
In [64]:
#defining optimizer
optimizer = keras.optimizers.Adam()

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model5.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [65]:
#defining batch size and epochs
batch_size = 100
epochs = 50

#fitting model
start = time.time()
history = model5.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, verbose = 2)
end=time.time()
Epoch 1/50
140/140 - 1s - 8ms/step - loss: 0.1239 - recall: 0.5071 - val_loss: 0.0786 - val_recall: 0.7658
Epoch 2/50
140/140 - 0s - 2ms/step - loss: 0.0620 - recall: 0.8005 - val_loss: 0.0637 - val_recall: 0.8138
Epoch 3/50
140/140 - 0s - 2ms/step - loss: 0.0508 - recall: 0.8520 - val_loss: 0.0600 - val_recall: 0.8318
Epoch 4/50
140/140 - 0s - 2ms/step - loss: 0.0456 - recall: 0.8700 - val_loss: 0.0585 - val_recall: 0.8318
Epoch 5/50
140/140 - 0s - 2ms/step - loss: 0.0421 - recall: 0.8739 - val_loss: 0.0579 - val_recall: 0.8348
Epoch 6/50
140/140 - 0s - 2ms/step - loss: 0.0396 - recall: 0.8829 - val_loss: 0.0577 - val_recall: 0.8408
Epoch 7/50
140/140 - 0s - 2ms/step - loss: 0.0376 - recall: 0.8893 - val_loss: 0.0580 - val_recall: 0.8438
Epoch 8/50
140/140 - 0s - 2ms/step - loss: 0.0360 - recall: 0.8932 - val_loss: 0.0579 - val_recall: 0.8468
Epoch 9/50
140/140 - 0s - 2ms/step - loss: 0.0347 - recall: 0.8945 - val_loss: 0.0583 - val_recall: 0.8529
Epoch 10/50
140/140 - 0s - 2ms/step - loss: 0.0334 - recall: 0.8970 - val_loss: 0.0586 - val_recall: 0.8619
Epoch 11/50
140/140 - 0s - 2ms/step - loss: 0.0323 - recall: 0.8996 - val_loss: 0.0590 - val_recall: 0.8619
Epoch 12/50
140/140 - 0s - 2ms/step - loss: 0.0313 - recall: 0.9048 - val_loss: 0.0591 - val_recall: 0.8649
Epoch 13/50
140/140 - 0s - 2ms/step - loss: 0.0303 - recall: 0.9048 - val_loss: 0.0597 - val_recall: 0.8619
Epoch 14/50
140/140 - 0s - 2ms/step - loss: 0.0294 - recall: 0.9060 - val_loss: 0.0602 - val_recall: 0.8619
Epoch 15/50
140/140 - 0s - 2ms/step - loss: 0.0283 - recall: 0.9086 - val_loss: 0.0606 - val_recall: 0.8589
Epoch 16/50
140/140 - 0s - 2ms/step - loss: 0.0274 - recall: 0.9073 - val_loss: 0.0615 - val_recall: 0.8529
Epoch 17/50
140/140 - 0s - 2ms/step - loss: 0.0266 - recall: 0.9073 - val_loss: 0.0621 - val_recall: 0.8529
Epoch 18/50
140/140 - 0s - 2ms/step - loss: 0.0257 - recall: 0.9086 - val_loss: 0.0630 - val_recall: 0.8529
Epoch 19/50
140/140 - 0s - 2ms/step - loss: 0.0249 - recall: 0.9125 - val_loss: 0.0638 - val_recall: 0.8529
Epoch 20/50
140/140 - 0s - 2ms/step - loss: 0.0242 - recall: 0.9125 - val_loss: 0.0637 - val_recall: 0.8589
Epoch 21/50
140/140 - 0s - 2ms/step - loss: 0.0232 - recall: 0.9138 - val_loss: 0.0652 - val_recall: 0.8589
Epoch 22/50
140/140 - 0s - 2ms/step - loss: 0.0226 - recall: 0.9138 - val_loss: 0.0653 - val_recall: 0.8649
Epoch 23/50
140/140 - 0s - 2ms/step - loss: 0.0216 - recall: 0.9189 - val_loss: 0.0662 - val_recall: 0.8649
Epoch 24/50
140/140 - 0s - 2ms/step - loss: 0.0209 - recall: 0.9202 - val_loss: 0.0667 - val_recall: 0.8679
Epoch 25/50
140/140 - 0s - 3ms/step - loss: 0.0203 - recall: 0.9215 - val_loss: 0.0676 - val_recall: 0.8649
Epoch 26/50
140/140 - 1s - 4ms/step - loss: 0.0196 - recall: 0.9241 - val_loss: 0.0688 - val_recall: 0.8679
Epoch 27/50
140/140 - 0s - 3ms/step - loss: 0.0192 - recall: 0.9254 - val_loss: 0.0694 - val_recall: 0.8679
Epoch 28/50
140/140 - 0s - 3ms/step - loss: 0.0185 - recall: 0.9254 - val_loss: 0.0710 - val_recall: 0.8679
Epoch 29/50
140/140 - 0s - 2ms/step - loss: 0.0176 - recall: 0.9279 - val_loss: 0.0731 - val_recall: 0.8649
Epoch 30/50
140/140 - 0s - 3ms/step - loss: 0.0172 - recall: 0.9305 - val_loss: 0.0743 - val_recall: 0.8619
Epoch 31/50
140/140 - 0s - 2ms/step - loss: 0.0167 - recall: 0.9305 - val_loss: 0.0746 - val_recall: 0.8679
Epoch 32/50
140/140 - 0s - 2ms/step - loss: 0.0160 - recall: 0.9331 - val_loss: 0.0753 - val_recall: 0.8709
Epoch 33/50
140/140 - 0s - 2ms/step - loss: 0.0153 - recall: 0.9318 - val_loss: 0.0770 - val_recall: 0.8709
Epoch 34/50
140/140 - 0s - 2ms/step - loss: 0.0150 - recall: 0.9382 - val_loss: 0.0777 - val_recall: 0.8649
Epoch 35/50
140/140 - 0s - 2ms/step - loss: 0.0143 - recall: 0.9369 - val_loss: 0.0798 - val_recall: 0.8619
Epoch 36/50
140/140 - 0s - 2ms/step - loss: 0.0135 - recall: 0.9382 - val_loss: 0.0792 - val_recall: 0.8619
Epoch 37/50
140/140 - 0s - 2ms/step - loss: 0.0132 - recall: 0.9408 - val_loss: 0.0818 - val_recall: 0.8619
Epoch 38/50
140/140 - 0s - 2ms/step - loss: 0.0129 - recall: 0.9459 - val_loss: 0.0837 - val_recall: 0.8529
Epoch 39/50
140/140 - 0s - 2ms/step - loss: 0.0125 - recall: 0.9434 - val_loss: 0.0837 - val_recall: 0.8619
Epoch 40/50
140/140 - 0s - 2ms/step - loss: 0.0119 - recall: 0.9459 - val_loss: 0.0885 - val_recall: 0.8378
Epoch 41/50
140/140 - 0s - 3ms/step - loss: 0.0123 - recall: 0.9434 - val_loss: 0.0912 - val_recall: 0.8348
Epoch 42/50
140/140 - 0s - 2ms/step - loss: 0.0123 - recall: 0.9459 - val_loss: 0.0911 - val_recall: 0.8408
Epoch 43/50
140/140 - 0s - 2ms/step - loss: 0.0114 - recall: 0.9550 - val_loss: 0.0902 - val_recall: 0.8559
Epoch 44/50
140/140 - 0s - 2ms/step - loss: 0.0107 - recall: 0.9537 - val_loss: 0.0957 - val_recall: 0.8348
Epoch 45/50
140/140 - 0s - 2ms/step - loss: 0.0102 - recall: 0.9562 - val_loss: 0.1020 - val_recall: 0.8138
Epoch 46/50
140/140 - 0s - 2ms/step - loss: 0.0119 - recall: 0.9459 - val_loss: 0.0968 - val_recall: 0.8709
Epoch 47/50
140/140 - 0s - 2ms/step - loss: 0.0151 - recall: 0.9344 - val_loss: 0.1056 - val_recall: 0.8078
Epoch 48/50
140/140 - 0s - 2ms/step - loss: 0.0118 - recall: 0.9421 - val_loss: 0.0946 - val_recall: 0.8408
Epoch 49/50
140/140 - 0s - 2ms/step - loss: 0.0091 - recall: 0.9537 - val_loss: 0.0963 - val_recall: 0.8709
Epoch 50/50
140/140 - 0s - 2ms/step - loss: 0.0070 - recall: 0.9640 - val_loss: 0.0995 - val_recall: 0.8679
In [66]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [67]:
#plot model's recall
plot(history, 'recall')
No description has been provided for this image
  • There is a big gap between training and validation loss-- validation drops until about the 4th epoch, then increases gradually before oscillating and increasing even more at 35 epochs
  • Meantime, training loss keeps falling all the way to 50 epochs
  • For recall, training increases gradually until about 4 epochs, then continues increasing gradually
  • Validation recall increases greatly until about 4 epochs, then has slowing increases before oscillating after 10 epochs, then dropping a bit until 42 epochs and increasing again
  • The large gaps and zig-zagging may be signs of overfitting
In [68]:
#add model to our results df
results.loc[5] = [
  2, #hidden layers
  [64, 128], #neurons/layer
  ["relu", "relu"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "Adam", #optimizer
  [0.001,"-"], # learning rate, momentum
  ["Xav", "Xav", "Xav"], #weight initializer
  "-", #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[68]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
4 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.0001, 0.9] [Xav, Xav, Xav] - 0.085880 0.092282 0.646075 0.672673 15.04
5 2 [64, 128] [relu, relu] 50 100 Adam [0.001, -] [Xav, Xav, Xav] - 0.007018 0.099477 0.963964 0.867868 15.84
In [181]:
model_5_train_perf = model_performance_classification(model5, X_train, y_train)
model_5_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 730us/step
Out[181]:
Accuracy Recall Precision F1 Score
0 0.997786 0.980657 0.9982 0.989251
In [182]:
model_5_val_perf = model_performance_classification(model5, X_val, y_val)
model_5_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 716us/step
Out[182]:
Accuracy Recall Precision F1 Score
0 0.992 0.933581 0.989319 0.959551

Model 6¶

Plan:

  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • Adam optimizer
  • Reduced learning rate of 1e-4 and increase epochs to 100
In [70]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model6 = Sequential()

#hidden layer
model6.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#hidden layer
model6.add(Dense(128, activation = 'relu'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model6.add(Dense(1, activation = 'sigmoid'))
In [71]:
#looking at model details
model6.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,073 (43.25 KB)
 Trainable params: 11,073 (43.25 KB)
 Non-trainable params: 0 (0.00 B)
In [72]:
#defining optimizer
lr = 1e-4
optimizer = keras.optimizers.Adam(learning_rate = lr)

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model6.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [73]:
#defining batch size and epochs
batch_size = 100
epochs = 100

#fitting model
start = time.time()
history = model6.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, verbose = 2)
end=time.time()
Epoch 1/100
140/140 - 1s - 8ms/step - loss: 0.4065 - recall: 0.1596 - val_loss: 0.1999 - val_recall: 0.0390
Epoch 2/100
140/140 - 0s - 2ms/step - loss: 0.1727 - recall: 0.1429 - val_loss: 0.1507 - val_recall: 0.2583
Epoch 3/100
140/140 - 0s - 2ms/step - loss: 0.1360 - recall: 0.3192 - val_loss: 0.1256 - val_recall: 0.4084
Epoch 4/100
140/140 - 0s - 3ms/step - loss: 0.1140 - recall: 0.4556 - val_loss: 0.1095 - val_recall: 0.5195
Epoch 5/100
140/140 - 0s - 3ms/step - loss: 0.0993 - recall: 0.5444 - val_loss: 0.0984 - val_recall: 0.6156
Epoch 6/100
140/140 - 0s - 3ms/step - loss: 0.0889 - recall: 0.6165 - val_loss: 0.0906 - val_recall: 0.6607
Epoch 7/100
140/140 - 1s - 5ms/step - loss: 0.0814 - recall: 0.6602 - val_loss: 0.0850 - val_recall: 0.6817
Epoch 8/100
140/140 - 0s - 3ms/step - loss: 0.0758 - recall: 0.7027 - val_loss: 0.0809 - val_recall: 0.7147
Epoch 9/100
140/140 - 0s - 2ms/step - loss: 0.0715 - recall: 0.7246 - val_loss: 0.0777 - val_recall: 0.7477
Epoch 10/100
140/140 - 0s - 2ms/step - loss: 0.0680 - recall: 0.7503 - val_loss: 0.0751 - val_recall: 0.7568
Epoch 11/100
140/140 - 0s - 2ms/step - loss: 0.0651 - recall: 0.7606 - val_loss: 0.0729 - val_recall: 0.7658
Epoch 12/100
140/140 - 0s - 2ms/step - loss: 0.0627 - recall: 0.7773 - val_loss: 0.0711 - val_recall: 0.7748
Epoch 13/100
140/140 - 0s - 2ms/step - loss: 0.0606 - recall: 0.7902 - val_loss: 0.0695 - val_recall: 0.7898
Epoch 14/100
140/140 - 0s - 2ms/step - loss: 0.0587 - recall: 0.8031 - val_loss: 0.0681 - val_recall: 0.8018
Epoch 15/100
140/140 - 0s - 2ms/step - loss: 0.0571 - recall: 0.8121 - val_loss: 0.0669 - val_recall: 0.8108
Epoch 16/100
140/140 - 0s - 2ms/step - loss: 0.0556 - recall: 0.8198 - val_loss: 0.0658 - val_recall: 0.8138
Epoch 17/100
140/140 - 0s - 2ms/step - loss: 0.0543 - recall: 0.8263 - val_loss: 0.0648 - val_recall: 0.8228
Epoch 18/100
140/140 - 0s - 2ms/step - loss: 0.0531 - recall: 0.8366 - val_loss: 0.0639 - val_recall: 0.8318
Epoch 19/100
140/140 - 0s - 2ms/step - loss: 0.0520 - recall: 0.8391 - val_loss: 0.0631 - val_recall: 0.8318
Epoch 20/100
140/140 - 0s - 2ms/step - loss: 0.0509 - recall: 0.8430 - val_loss: 0.0624 - val_recall: 0.8318
Epoch 21/100
140/140 - 0s - 2ms/step - loss: 0.0500 - recall: 0.8468 - val_loss: 0.0617 - val_recall: 0.8318
Epoch 22/100
140/140 - 0s - 2ms/step - loss: 0.0491 - recall: 0.8494 - val_loss: 0.0611 - val_recall: 0.8318
Epoch 23/100
140/140 - 0s - 2ms/step - loss: 0.0482 - recall: 0.8520 - val_loss: 0.0605 - val_recall: 0.8378
Epoch 24/100
140/140 - 0s - 2ms/step - loss: 0.0474 - recall: 0.8546 - val_loss: 0.0600 - val_recall: 0.8378
Epoch 25/100
140/140 - 0s - 2ms/step - loss: 0.0466 - recall: 0.8571 - val_loss: 0.0595 - val_recall: 0.8378
Epoch 26/100
140/140 - 0s - 2ms/step - loss: 0.0458 - recall: 0.8597 - val_loss: 0.0591 - val_recall: 0.8378
Epoch 27/100
140/140 - 0s - 2ms/step - loss: 0.0451 - recall: 0.8610 - val_loss: 0.0586 - val_recall: 0.8408
Epoch 28/100
140/140 - 0s - 2ms/step - loss: 0.0444 - recall: 0.8623 - val_loss: 0.0583 - val_recall: 0.8408
Epoch 29/100
140/140 - 0s - 2ms/step - loss: 0.0437 - recall: 0.8662 - val_loss: 0.0580 - val_recall: 0.8408
Epoch 30/100
140/140 - 0s - 2ms/step - loss: 0.0431 - recall: 0.8687 - val_loss: 0.0577 - val_recall: 0.8408
Epoch 31/100
140/140 - 0s - 2ms/step - loss: 0.0425 - recall: 0.8700 - val_loss: 0.0574 - val_recall: 0.8408
Epoch 32/100
140/140 - 0s - 2ms/step - loss: 0.0419 - recall: 0.8726 - val_loss: 0.0571 - val_recall: 0.8438
Epoch 33/100
140/140 - 0s - 2ms/step - loss: 0.0413 - recall: 0.8726 - val_loss: 0.0568 - val_recall: 0.8438
Epoch 34/100
140/140 - 0s - 2ms/step - loss: 0.0407 - recall: 0.8739 - val_loss: 0.0566 - val_recall: 0.8468
Epoch 35/100
140/140 - 0s - 2ms/step - loss: 0.0402 - recall: 0.8752 - val_loss: 0.0564 - val_recall: 0.8468
Epoch 36/100
140/140 - 0s - 2ms/step - loss: 0.0397 - recall: 0.8752 - val_loss: 0.0562 - val_recall: 0.8498
Epoch 37/100
140/140 - 0s - 2ms/step - loss: 0.0392 - recall: 0.8777 - val_loss: 0.0560 - val_recall: 0.8498
Epoch 38/100
140/140 - 0s - 2ms/step - loss: 0.0387 - recall: 0.8790 - val_loss: 0.0559 - val_recall: 0.8498
Epoch 39/100
140/140 - 0s - 2ms/step - loss: 0.0383 - recall: 0.8803 - val_loss: 0.0558 - val_recall: 0.8498
Epoch 40/100
140/140 - 0s - 2ms/step - loss: 0.0378 - recall: 0.8803 - val_loss: 0.0556 - val_recall: 0.8498
Epoch 41/100
140/140 - 0s - 2ms/step - loss: 0.0374 - recall: 0.8816 - val_loss: 0.0555 - val_recall: 0.8498
Epoch 42/100
140/140 - 0s - 2ms/step - loss: 0.0369 - recall: 0.8829 - val_loss: 0.0554 - val_recall: 0.8498
Epoch 43/100
140/140 - 0s - 2ms/step - loss: 0.0365 - recall: 0.8842 - val_loss: 0.0553 - val_recall: 0.8498
Epoch 44/100
140/140 - 0s - 3ms/step - loss: 0.0361 - recall: 0.8867 - val_loss: 0.0553 - val_recall: 0.8498
Epoch 45/100
140/140 - 1s - 4ms/step - loss: 0.0357 - recall: 0.8893 - val_loss: 0.0552 - val_recall: 0.8498
Epoch 46/100
140/140 - 1s - 5ms/step - loss: 0.0354 - recall: 0.8906 - val_loss: 0.0552 - val_recall: 0.8529
Epoch 47/100
140/140 - 0s - 2ms/step - loss: 0.0350 - recall: 0.8906 - val_loss: 0.0551 - val_recall: 0.8529
Epoch 48/100
140/140 - 0s - 2ms/step - loss: 0.0346 - recall: 0.8932 - val_loss: 0.0551 - val_recall: 0.8529
Epoch 49/100
140/140 - 0s - 2ms/step - loss: 0.0343 - recall: 0.8970 - val_loss: 0.0550 - val_recall: 0.8529
Epoch 50/100
140/140 - 0s - 2ms/step - loss: 0.0340 - recall: 0.8970 - val_loss: 0.0551 - val_recall: 0.8529
Epoch 51/100
140/140 - 0s - 2ms/step - loss: 0.0336 - recall: 0.8970 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 52/100
140/140 - 0s - 2ms/step - loss: 0.0333 - recall: 0.8970 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 53/100
140/140 - 0s - 2ms/step - loss: 0.0330 - recall: 0.8970 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 54/100
140/140 - 0s - 2ms/step - loss: 0.0327 - recall: 0.8996 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 55/100
140/140 - 0s - 2ms/step - loss: 0.0324 - recall: 0.8996 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 56/100
140/140 - 0s - 2ms/step - loss: 0.0321 - recall: 0.9009 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 57/100
140/140 - 0s - 2ms/step - loss: 0.0318 - recall: 0.9009 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 58/100
140/140 - 0s - 2ms/step - loss: 0.0315 - recall: 0.9009 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 59/100
140/140 - 0s - 2ms/step - loss: 0.0312 - recall: 0.9035 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 60/100
140/140 - 0s - 2ms/step - loss: 0.0310 - recall: 0.9035 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 61/100
140/140 - 0s - 2ms/step - loss: 0.0307 - recall: 0.9035 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 62/100
140/140 - 0s - 2ms/step - loss: 0.0305 - recall: 0.9035 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 63/100
140/140 - 0s - 2ms/step - loss: 0.0302 - recall: 0.9035 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 64/100
140/140 - 0s - 2ms/step - loss: 0.0300 - recall: 0.9035 - val_loss: 0.0550 - val_recall: 0.8559
Epoch 65/100
140/140 - 0s - 2ms/step - loss: 0.0297 - recall: 0.9048 - val_loss: 0.0551 - val_recall: 0.8559
Epoch 66/100
140/140 - 0s - 2ms/step - loss: 0.0295 - recall: 0.9048 - val_loss: 0.0551 - val_recall: 0.8559
Epoch 67/100
140/140 - 0s - 2ms/step - loss: 0.0292 - recall: 0.9048 - val_loss: 0.0552 - val_recall: 0.8559
Epoch 68/100
140/140 - 0s - 2ms/step - loss: 0.0290 - recall: 0.9048 - val_loss: 0.0552 - val_recall: 0.8559
Epoch 69/100
140/140 - 0s - 2ms/step - loss: 0.0288 - recall: 0.9048 - val_loss: 0.0552 - val_recall: 0.8559
Epoch 70/100
140/140 - 0s - 2ms/step - loss: 0.0286 - recall: 0.9048 - val_loss: 0.0552 - val_recall: 0.8559
Epoch 71/100
140/140 - 0s - 2ms/step - loss: 0.0283 - recall: 0.9048 - val_loss: 0.0553 - val_recall: 0.8559
Epoch 72/100
140/140 - 0s - 2ms/step - loss: 0.0281 - recall: 0.9048 - val_loss: 0.0553 - val_recall: 0.8559
Epoch 73/100
140/140 - 0s - 2ms/step - loss: 0.0279 - recall: 0.9048 - val_loss: 0.0553 - val_recall: 0.8559
Epoch 74/100
140/140 - 0s - 2ms/step - loss: 0.0277 - recall: 0.9048 - val_loss: 0.0554 - val_recall: 0.8559
Epoch 75/100
140/140 - 0s - 2ms/step - loss: 0.0275 - recall: 0.9060 - val_loss: 0.0554 - val_recall: 0.8559
Epoch 76/100
140/140 - 0s - 2ms/step - loss: 0.0273 - recall: 0.9073 - val_loss: 0.0554 - val_recall: 0.8559
Epoch 77/100
140/140 - 0s - 2ms/step - loss: 0.0271 - recall: 0.9073 - val_loss: 0.0554 - val_recall: 0.8589
Epoch 78/100
140/140 - 0s - 2ms/step - loss: 0.0269 - recall: 0.9099 - val_loss: 0.0555 - val_recall: 0.8589
Epoch 79/100
140/140 - 0s - 2ms/step - loss: 0.0267 - recall: 0.9112 - val_loss: 0.0555 - val_recall: 0.8589
Epoch 80/100
140/140 - 0s - 2ms/step - loss: 0.0265 - recall: 0.9112 - val_loss: 0.0555 - val_recall: 0.8589
Epoch 81/100
140/140 - 0s - 2ms/step - loss: 0.0263 - recall: 0.9112 - val_loss: 0.0556 - val_recall: 0.8589
Epoch 82/100
140/140 - 0s - 2ms/step - loss: 0.0261 - recall: 0.9112 - val_loss: 0.0556 - val_recall: 0.8589
Epoch 83/100
140/140 - 0s - 2ms/step - loss: 0.0259 - recall: 0.9112 - val_loss: 0.0557 - val_recall: 0.8619
Epoch 84/100
140/140 - 1s - 5ms/step - loss: 0.0257 - recall: 0.9112 - val_loss: 0.0557 - val_recall: 0.8619
Epoch 85/100
140/140 - 0s - 3ms/step - loss: 0.0255 - recall: 0.9112 - val_loss: 0.0557 - val_recall: 0.8619
Epoch 86/100
140/140 - 1s - 4ms/step - loss: 0.0254 - recall: 0.9112 - val_loss: 0.0558 - val_recall: 0.8619
Epoch 87/100
140/140 - 0s - 2ms/step - loss: 0.0252 - recall: 0.9112 - val_loss: 0.0558 - val_recall: 0.8619
Epoch 88/100
140/140 - 0s - 2ms/step - loss: 0.0250 - recall: 0.9112 - val_loss: 0.0559 - val_recall: 0.8619
Epoch 89/100
140/140 - 0s - 2ms/step - loss: 0.0248 - recall: 0.9112 - val_loss: 0.0559 - val_recall: 0.8619
Epoch 90/100
140/140 - 0s - 2ms/step - loss: 0.0246 - recall: 0.9112 - val_loss: 0.0559 - val_recall: 0.8619
Epoch 91/100
140/140 - 0s - 2ms/step - loss: 0.0245 - recall: 0.9112 - val_loss: 0.0560 - val_recall: 0.8619
Epoch 92/100
140/140 - 0s - 2ms/step - loss: 0.0243 - recall: 0.9112 - val_loss: 0.0561 - val_recall: 0.8619
Epoch 93/100
140/140 - 0s - 2ms/step - loss: 0.0241 - recall: 0.9099 - val_loss: 0.0561 - val_recall: 0.8619
Epoch 94/100
140/140 - 0s - 2ms/step - loss: 0.0240 - recall: 0.9099 - val_loss: 0.0562 - val_recall: 0.8619
Epoch 95/100
140/140 - 0s - 2ms/step - loss: 0.0238 - recall: 0.9099 - val_loss: 0.0563 - val_recall: 0.8619
Epoch 96/100
140/140 - 0s - 2ms/step - loss: 0.0236 - recall: 0.9099 - val_loss: 0.0563 - val_recall: 0.8619
Epoch 97/100
140/140 - 0s - 2ms/step - loss: 0.0235 - recall: 0.9099 - val_loss: 0.0564 - val_recall: 0.8619
Epoch 98/100
140/140 - 0s - 2ms/step - loss: 0.0233 - recall: 0.9099 - val_loss: 0.0565 - val_recall: 0.8619
Epoch 99/100
140/140 - 0s - 2ms/step - loss: 0.0231 - recall: 0.9099 - val_loss: 0.0566 - val_recall: 0.8619
Epoch 100/100
140/140 - 0s - 2ms/step - loss: 0.0230 - recall: 0.9099 - val_loss: 0.0567 - val_recall: 0.8619
In [74]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [75]:
#plot model's recall
plot(history, 'recall')
No description has been provided for this image
In [183]:
model_6_train_perf = model_performance_classification(model6, X_train, y_train)
model_6_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 700us/step
Out[183]:
Accuracy Recall Precision F1 Score
0 0.994786 0.956053 0.99394 0.974143
In [184]:
model_6_val_perf = model_performance_classification(model6, X_val, y_val)
model_6_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
Out[184]:
Accuracy Recall Precision F1 Score
0 0.990667 0.930049 0.979132 0.953093

Model 7¶

Plan:

  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • Adam optimizer
  • Reduced learning rate of 1e-4 and increase epochs to 100
  • Dropout of 0.2
In [78]:
#setting the dropout rate
dropout_rate = 0.2
In [79]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model7 = Sequential()

#hidden layer
model7.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#dropout layer
model7.add(Dropout(dropout_rate))

#hidden layer
model7.add(Dense(128, activation = 'relu'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model7.add(Dense(1, activation = 'sigmoid'))
In [80]:
#looking at model details
model7.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,073 (43.25 KB)
 Trainable params: 11,073 (43.25 KB)
 Non-trainable params: 0 (0.00 B)
In [81]:
#defining optimizer
lr = 1e-4
optimizer = keras.optimizers.Adam(learning_rate = lr)

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model7.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [82]:
#defining batch size and epochs
batch_size = 100
epochs = 100

#fitting model
start = time.time()
history = model7.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, verbose = 2)
end=time.time()
Epoch 1/100
140/140 - 1s - 9ms/step - loss: 0.2993 - recall: 0.1236 - val_loss: 0.1716 - val_recall: 0.1111
Epoch 2/100
140/140 - 0s - 2ms/step - loss: 0.1630 - recall: 0.2046 - val_loss: 0.1365 - val_recall: 0.3063
Epoch 3/100
140/140 - 0s - 2ms/step - loss: 0.1385 - recall: 0.3295 - val_loss: 0.1195 - val_recall: 0.3964
Epoch 4/100
140/140 - 0s - 2ms/step - loss: 0.1243 - recall: 0.3977 - val_loss: 0.1071 - val_recall: 0.4715
Epoch 5/100
140/140 - 0s - 2ms/step - loss: 0.1100 - recall: 0.4672 - val_loss: 0.0980 - val_recall: 0.5676
Epoch 6/100
140/140 - 0s - 2ms/step - loss: 0.1018 - recall: 0.5380 - val_loss: 0.0913 - val_recall: 0.6396
Epoch 7/100
140/140 - 0s - 2ms/step - loss: 0.0969 - recall: 0.5882 - val_loss: 0.0861 - val_recall: 0.6757
Epoch 8/100
140/140 - 0s - 2ms/step - loss: 0.0905 - recall: 0.6332 - val_loss: 0.0821 - val_recall: 0.7027
Epoch 9/100
140/140 - 0s - 2ms/step - loss: 0.0855 - recall: 0.6589 - val_loss: 0.0793 - val_recall: 0.7327
Epoch 10/100
140/140 - 0s - 2ms/step - loss: 0.0824 - recall: 0.6744 - val_loss: 0.0768 - val_recall: 0.7447
Epoch 11/100
140/140 - 0s - 2ms/step - loss: 0.0780 - recall: 0.7027 - val_loss: 0.0744 - val_recall: 0.7628
Epoch 12/100
140/140 - 0s - 2ms/step - loss: 0.0750 - recall: 0.7156 - val_loss: 0.0725 - val_recall: 0.7808
Epoch 13/100
140/140 - 0s - 2ms/step - loss: 0.0730 - recall: 0.7272 - val_loss: 0.0710 - val_recall: 0.7868
Epoch 14/100
140/140 - 0s - 2ms/step - loss: 0.0717 - recall: 0.7336 - val_loss: 0.0695 - val_recall: 0.7988
Epoch 15/100
140/140 - 0s - 2ms/step - loss: 0.0694 - recall: 0.7606 - val_loss: 0.0684 - val_recall: 0.8048
Epoch 16/100
140/140 - 0s - 2ms/step - loss: 0.0681 - recall: 0.7580 - val_loss: 0.0674 - val_recall: 0.8168
Epoch 17/100
140/140 - 0s - 2ms/step - loss: 0.0664 - recall: 0.7748 - val_loss: 0.0665 - val_recall: 0.8198
Epoch 18/100
140/140 - 0s - 2ms/step - loss: 0.0661 - recall: 0.7761 - val_loss: 0.0657 - val_recall: 0.8108
Epoch 19/100
140/140 - 0s - 2ms/step - loss: 0.0637 - recall: 0.7851 - val_loss: 0.0648 - val_recall: 0.8138
Epoch 20/100
140/140 - 0s - 2ms/step - loss: 0.0630 - recall: 0.7902 - val_loss: 0.0641 - val_recall: 0.8168
Epoch 21/100
140/140 - 0s - 2ms/step - loss: 0.0622 - recall: 0.7928 - val_loss: 0.0634 - val_recall: 0.8318
Epoch 22/100
140/140 - 0s - 2ms/step - loss: 0.0617 - recall: 0.7941 - val_loss: 0.0628 - val_recall: 0.8288
Epoch 23/100
140/140 - 0s - 2ms/step - loss: 0.0590 - recall: 0.8057 - val_loss: 0.0618 - val_recall: 0.8318
Epoch 24/100
140/140 - 0s - 2ms/step - loss: 0.0573 - recall: 0.8121 - val_loss: 0.0614 - val_recall: 0.8228
Epoch 25/100
140/140 - 0s - 3ms/step - loss: 0.0579 - recall: 0.8057 - val_loss: 0.0611 - val_recall: 0.8378
Epoch 26/100
140/140 - 0s - 3ms/step - loss: 0.0566 - recall: 0.8134 - val_loss: 0.0604 - val_recall: 0.8438
Epoch 27/100
140/140 - 0s - 3ms/step - loss: 0.0556 - recall: 0.8185 - val_loss: 0.0602 - val_recall: 0.8438
Epoch 28/100
140/140 - 1s - 5ms/step - loss: 0.0562 - recall: 0.8095 - val_loss: 0.0599 - val_recall: 0.8408
Epoch 29/100
140/140 - 0s - 3ms/step - loss: 0.0543 - recall: 0.8147 - val_loss: 0.0594 - val_recall: 0.8468
Epoch 30/100
140/140 - 0s - 2ms/step - loss: 0.0542 - recall: 0.8198 - val_loss: 0.0590 - val_recall: 0.8498
Epoch 31/100
140/140 - 0s - 2ms/step - loss: 0.0529 - recall: 0.8301 - val_loss: 0.0587 - val_recall: 0.8498
Epoch 32/100
140/140 - 0s - 2ms/step - loss: 0.0540 - recall: 0.8275 - val_loss: 0.0581 - val_recall: 0.8498
Epoch 33/100
140/140 - 0s - 2ms/step - loss: 0.0530 - recall: 0.8237 - val_loss: 0.0573 - val_recall: 0.8498
Epoch 34/100
140/140 - 0s - 2ms/step - loss: 0.0500 - recall: 0.8340 - val_loss: 0.0573 - val_recall: 0.8498
Epoch 35/100
140/140 - 0s - 2ms/step - loss: 0.0506 - recall: 0.8314 - val_loss: 0.0571 - val_recall: 0.8498
Epoch 36/100
140/140 - 0s - 2ms/step - loss: 0.0506 - recall: 0.8314 - val_loss: 0.0569 - val_recall: 0.8498
Epoch 37/100
140/140 - 0s - 2ms/step - loss: 0.0501 - recall: 0.8366 - val_loss: 0.0570 - val_recall: 0.8498
Epoch 38/100
140/140 - 0s - 2ms/step - loss: 0.0491 - recall: 0.8430 - val_loss: 0.0564 - val_recall: 0.8498
Epoch 39/100
140/140 - 0s - 2ms/step - loss: 0.0502 - recall: 0.8468 - val_loss: 0.0561 - val_recall: 0.8529
Epoch 40/100
140/140 - 0s - 2ms/step - loss: 0.0488 - recall: 0.8391 - val_loss: 0.0563 - val_recall: 0.8529
Epoch 41/100
140/140 - 0s - 2ms/step - loss: 0.0481 - recall: 0.8456 - val_loss: 0.0560 - val_recall: 0.8529
Epoch 42/100
140/140 - 0s - 2ms/step - loss: 0.0475 - recall: 0.8353 - val_loss: 0.0557 - val_recall: 0.8529
Epoch 43/100
140/140 - 0s - 2ms/step - loss: 0.0495 - recall: 0.8366 - val_loss: 0.0556 - val_recall: 0.8529
Epoch 44/100
140/140 - 0s - 2ms/step - loss: 0.0473 - recall: 0.8468 - val_loss: 0.0553 - val_recall: 0.8529
Epoch 45/100
140/140 - 0s - 2ms/step - loss: 0.0470 - recall: 0.8533 - val_loss: 0.0552 - val_recall: 0.8529
Epoch 46/100
140/140 - 0s - 2ms/step - loss: 0.0461 - recall: 0.8546 - val_loss: 0.0549 - val_recall: 0.8529
Epoch 47/100
140/140 - 0s - 2ms/step - loss: 0.0472 - recall: 0.8494 - val_loss: 0.0548 - val_recall: 0.8529
Epoch 48/100
140/140 - 0s - 2ms/step - loss: 0.0469 - recall: 0.8533 - val_loss: 0.0546 - val_recall: 0.8529
Epoch 49/100
140/140 - 0s - 2ms/step - loss: 0.0467 - recall: 0.8559 - val_loss: 0.0544 - val_recall: 0.8529
Epoch 50/100
140/140 - 0s - 2ms/step - loss: 0.0469 - recall: 0.8494 - val_loss: 0.0544 - val_recall: 0.8529
Epoch 51/100
140/140 - 0s - 2ms/step - loss: 0.0453 - recall: 0.8481 - val_loss: 0.0543 - val_recall: 0.8559
Epoch 52/100
140/140 - 0s - 2ms/step - loss: 0.0448 - recall: 0.8507 - val_loss: 0.0540 - val_recall: 0.8529
Epoch 53/100
140/140 - 0s - 2ms/step - loss: 0.0443 - recall: 0.8571 - val_loss: 0.0538 - val_recall: 0.8498
Epoch 54/100
140/140 - 0s - 2ms/step - loss: 0.0444 - recall: 0.8597 - val_loss: 0.0538 - val_recall: 0.8529
Epoch 55/100
140/140 - 0s - 2ms/step - loss: 0.0439 - recall: 0.8481 - val_loss: 0.0536 - val_recall: 0.8589
Epoch 56/100
140/140 - 0s - 2ms/step - loss: 0.0446 - recall: 0.8571 - val_loss: 0.0533 - val_recall: 0.8559
Epoch 57/100
140/140 - 0s - 2ms/step - loss: 0.0435 - recall: 0.8494 - val_loss: 0.0533 - val_recall: 0.8649
Epoch 58/100
140/140 - 0s - 2ms/step - loss: 0.0423 - recall: 0.8584 - val_loss: 0.0534 - val_recall: 0.8649
Epoch 59/100
140/140 - 0s - 2ms/step - loss: 0.0437 - recall: 0.8430 - val_loss: 0.0533 - val_recall: 0.8619
Epoch 60/100
140/140 - 0s - 2ms/step - loss: 0.0423 - recall: 0.8687 - val_loss: 0.0530 - val_recall: 0.8619
Epoch 61/100
140/140 - 0s - 2ms/step - loss: 0.0434 - recall: 0.8636 - val_loss: 0.0530 - val_recall: 0.8619
Epoch 62/100
140/140 - 0s - 2ms/step - loss: 0.0428 - recall: 0.8636 - val_loss: 0.0534 - val_recall: 0.8589
Epoch 63/100
140/140 - 0s - 2ms/step - loss: 0.0418 - recall: 0.8662 - val_loss: 0.0532 - val_recall: 0.8589
Epoch 64/100
140/140 - 0s - 3ms/step - loss: 0.0407 - recall: 0.8623 - val_loss: 0.0534 - val_recall: 0.8619
Epoch 65/100
140/140 - 0s - 3ms/step - loss: 0.0403 - recall: 0.8649 - val_loss: 0.0532 - val_recall: 0.8649
Epoch 66/100
140/140 - 0s - 3ms/step - loss: 0.0412 - recall: 0.8571 - val_loss: 0.0533 - val_recall: 0.8649
Epoch 67/100
140/140 - 1s - 5ms/step - loss: 0.0420 - recall: 0.8713 - val_loss: 0.0533 - val_recall: 0.8619
Epoch 68/100
140/140 - 0s - 2ms/step - loss: 0.0410 - recall: 0.8610 - val_loss: 0.0530 - val_recall: 0.8649
Epoch 69/100
140/140 - 0s - 2ms/step - loss: 0.0408 - recall: 0.8649 - val_loss: 0.0533 - val_recall: 0.8619
Epoch 70/100
140/140 - 0s - 2ms/step - loss: 0.0403 - recall: 0.8649 - val_loss: 0.0532 - val_recall: 0.8649
Epoch 71/100
140/140 - 0s - 2ms/step - loss: 0.0403 - recall: 0.8752 - val_loss: 0.0531 - val_recall: 0.8679
Epoch 72/100
140/140 - 0s - 2ms/step - loss: 0.0395 - recall: 0.8713 - val_loss: 0.0529 - val_recall: 0.8679
Epoch 73/100
140/140 - 0s - 2ms/step - loss: 0.0388 - recall: 0.8752 - val_loss: 0.0527 - val_recall: 0.8649
Epoch 74/100
140/140 - 0s - 2ms/step - loss: 0.0406 - recall: 0.8649 - val_loss: 0.0532 - val_recall: 0.8649
Epoch 75/100
140/140 - 0s - 2ms/step - loss: 0.0389 - recall: 0.8674 - val_loss: 0.0530 - val_recall: 0.8649
Epoch 76/100
140/140 - 0s - 2ms/step - loss: 0.0404 - recall: 0.8674 - val_loss: 0.0525 - val_recall: 0.8649
Epoch 77/100
140/140 - 0s - 2ms/step - loss: 0.0393 - recall: 0.8752 - val_loss: 0.0526 - val_recall: 0.8679
Epoch 78/100
140/140 - 0s - 2ms/step - loss: 0.0388 - recall: 0.8739 - val_loss: 0.0526 - val_recall: 0.8649
Epoch 79/100
140/140 - 0s - 2ms/step - loss: 0.0378 - recall: 0.8764 - val_loss: 0.0525 - val_recall: 0.8649
Epoch 80/100
140/140 - 0s - 2ms/step - loss: 0.0387 - recall: 0.8739 - val_loss: 0.0525 - val_recall: 0.8679
Epoch 81/100
140/140 - 0s - 3ms/step - loss: 0.0405 - recall: 0.8623 - val_loss: 0.0526 - val_recall: 0.8679
Epoch 82/100
140/140 - 0s - 2ms/step - loss: 0.0397 - recall: 0.8687 - val_loss: 0.0522 - val_recall: 0.8679
Epoch 83/100
140/140 - 0s - 2ms/step - loss: 0.0371 - recall: 0.8700 - val_loss: 0.0524 - val_recall: 0.8679
Epoch 84/100
140/140 - 0s - 2ms/step - loss: 0.0390 - recall: 0.8700 - val_loss: 0.0525 - val_recall: 0.8649
Epoch 85/100
140/140 - 0s - 2ms/step - loss: 0.0389 - recall: 0.8700 - val_loss: 0.0525 - val_recall: 0.8679
Epoch 86/100
140/140 - 0s - 2ms/step - loss: 0.0392 - recall: 0.8700 - val_loss: 0.0522 - val_recall: 0.8649
Epoch 87/100
140/140 - 0s - 2ms/step - loss: 0.0386 - recall: 0.8584 - val_loss: 0.0525 - val_recall: 0.8649
Epoch 88/100
140/140 - 0s - 3ms/step - loss: 0.0364 - recall: 0.8803 - val_loss: 0.0525 - val_recall: 0.8679
Epoch 89/100
140/140 - 0s - 2ms/step - loss: 0.0382 - recall: 0.8726 - val_loss: 0.0524 - val_recall: 0.8679
Epoch 90/100
140/140 - 0s - 2ms/step - loss: 0.0382 - recall: 0.8687 - val_loss: 0.0520 - val_recall: 0.8679
Epoch 91/100
140/140 - 0s - 2ms/step - loss: 0.0367 - recall: 0.8777 - val_loss: 0.0524 - val_recall: 0.8709
Epoch 92/100
140/140 - 0s - 2ms/step - loss: 0.0379 - recall: 0.8726 - val_loss: 0.0525 - val_recall: 0.8709
Epoch 93/100
140/140 - 0s - 2ms/step - loss: 0.0391 - recall: 0.8752 - val_loss: 0.0525 - val_recall: 0.8679
Epoch 94/100
140/140 - 0s - 2ms/step - loss: 0.0373 - recall: 0.8790 - val_loss: 0.0524 - val_recall: 0.8679
Epoch 95/100
140/140 - 0s - 2ms/step - loss: 0.0368 - recall: 0.8662 - val_loss: 0.0526 - val_recall: 0.8679
Epoch 96/100
140/140 - 0s - 2ms/step - loss: 0.0373 - recall: 0.8726 - val_loss: 0.0520 - val_recall: 0.8679
Epoch 97/100
140/140 - 0s - 2ms/step - loss: 0.0365 - recall: 0.8674 - val_loss: 0.0524 - val_recall: 0.8739
Epoch 98/100
140/140 - 0s - 2ms/step - loss: 0.0366 - recall: 0.8790 - val_loss: 0.0520 - val_recall: 0.8739
Epoch 99/100
140/140 - 0s - 2ms/step - loss: 0.0368 - recall: 0.8764 - val_loss: 0.0523 - val_recall: 0.8739
Epoch 100/100
140/140 - 0s - 2ms/step - loss: 0.0373 - recall: 0.8700 - val_loss: 0.0520 - val_recall: 0.8739
In [83]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [84]:
#plot model's recall
plot(history,'recall')
No description has been provided for this image
  • We see dramatic drops in loss for both training and validation until the 7th epoch, steady decreases until the 20th epoch, then a flattening trend
  • Validation loss is mostly the same as training loss until about the 20th epoch, then remains consistently higher
  • Training loss oscillates quite a bit, but validation does not
  • For recall, both validation and training see dramatic increases until about 17th epoch, then validation remains higher than training until about the 42nd epoch, when both are mostly the same
  • Training recall oscillates much more than validation
In [85]:
#add model to our results df
results.loc[7] = [
  2, #hidden layers
  [64, 128], #neurons/layer
  ["relu", "relu"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "Adam", #optimizer
  [0.0001, "-"], # learning rate, momentum
  ["Xav", "Xav", "Xav"], #weight initializer
  "Dropout (0.2)", #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[85]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
4 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.0001, 0.9] [Xav, Xav, Xav] - 0.085880 0.092282 0.646075 0.672673 15.04
5 2 [64, 128] [relu, relu] 50 100 Adam [0.001, -] [Xav, Xav, Xav] - 0.007018 0.099477 0.963964 0.867868 15.84
6 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] - 0.022984 0.056675 0.909910 0.861862 31.01
7 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] Dropout (0.2) 0.037275 0.052004 0.870013 0.873874 31.34
In [206]:
model_7_train_perf = model_performance_classification(model7, X_train, y_train)
model_7_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 685us/step
Out[206]:
Accuracy Recall Precision F1 Score
0 0.994 0.951397 0.990849 0.97019
In [186]:
model_7_val_perf = model_performance_classification(model7, X_val, y_val)
model_7_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 859us/step
Out[186]:
Accuracy Recall Precision F1 Score
0 0.991667 0.936231 0.982939 0.958244

Model 8¶

Plan:

  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • Adam optimizer
  • Reduced learning rate of 1e-4 and increase epochs to 100
  • Dropout of 0.2
  • Batch normalization
In [87]:
#setting the dropout rate
dropout_rate = 0.2
In [88]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model8 = Sequential()

#hidden layer
model8.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#batch normalization
model8.add(BatchNormalization())

#dropout layer
model8.add(Dropout(dropout_rate))

#hidden layer
model8.add(Dense(128, activation = 'relu'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model8.add(Dense(1, activation = 'sigmoid'))
In [89]:
#looking at model details
model8.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization             │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,329 (44.25 KB)
 Trainable params: 11,201 (43.75 KB)
 Non-trainable params: 128 (512.00 B)
In [90]:
#defining optimizer
lr = 1e-4
optimizer = keras.optimizers.Adam(learning_rate = lr)

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model8.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [91]:
#defining batch size and epochs
batch_size = 100
epochs = 100

#fitting model
start = time.time()
history = model8.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, verbose = 2)
end=time.time()
Epoch 1/100
140/140 - 1s - 10ms/step - loss: 0.3129 - recall: 0.2857 - val_loss: 0.2006 - val_recall: 0.2943
Epoch 2/100
140/140 - 0s - 2ms/step - loss: 0.1763 - recall: 0.3012 - val_loss: 0.1409 - val_recall: 0.3904
Epoch 3/100
140/140 - 0s - 2ms/step - loss: 0.1358 - recall: 0.3848 - val_loss: 0.1139 - val_recall: 0.4895
Epoch 4/100
140/140 - 0s - 2ms/step - loss: 0.1145 - recall: 0.4582 - val_loss: 0.0992 - val_recall: 0.5556
Epoch 5/100
140/140 - 0s - 2ms/step - loss: 0.1007 - recall: 0.5109 - val_loss: 0.0892 - val_recall: 0.6216
Epoch 6/100
140/140 - 0s - 2ms/step - loss: 0.0915 - recall: 0.5779 - val_loss: 0.0825 - val_recall: 0.6607
Epoch 7/100
140/140 - 0s - 2ms/step - loss: 0.0843 - recall: 0.6023 - val_loss: 0.0775 - val_recall: 0.6877
Epoch 8/100
140/140 - 0s - 2ms/step - loss: 0.0791 - recall: 0.6371 - val_loss: 0.0736 - val_recall: 0.7087
Epoch 9/100
140/140 - 0s - 2ms/step - loss: 0.0764 - recall: 0.6628 - val_loss: 0.0704 - val_recall: 0.7327
Epoch 10/100
140/140 - 0s - 2ms/step - loss: 0.0700 - recall: 0.6988 - val_loss: 0.0676 - val_recall: 0.7477
Epoch 11/100
140/140 - 0s - 2ms/step - loss: 0.0678 - recall: 0.7156 - val_loss: 0.0656 - val_recall: 0.7628
Epoch 12/100
140/140 - 0s - 2ms/step - loss: 0.0657 - recall: 0.7259 - val_loss: 0.0640 - val_recall: 0.7778
Epoch 13/100
140/140 - 0s - 2ms/step - loss: 0.0637 - recall: 0.7272 - val_loss: 0.0624 - val_recall: 0.7868
Epoch 14/100
140/140 - 0s - 2ms/step - loss: 0.0602 - recall: 0.7452 - val_loss: 0.0612 - val_recall: 0.7898
Epoch 15/100
140/140 - 0s - 2ms/step - loss: 0.0598 - recall: 0.7619 - val_loss: 0.0602 - val_recall: 0.7958
Epoch 16/100
140/140 - 0s - 2ms/step - loss: 0.0579 - recall: 0.7773 - val_loss: 0.0591 - val_recall: 0.7928
Epoch 17/100
140/140 - 0s - 2ms/step - loss: 0.0543 - recall: 0.7902 - val_loss: 0.0581 - val_recall: 0.8078
Epoch 18/100
140/140 - 0s - 2ms/step - loss: 0.0542 - recall: 0.7915 - val_loss: 0.0575 - val_recall: 0.8108
Epoch 19/100
140/140 - 0s - 2ms/step - loss: 0.0531 - recall: 0.8095 - val_loss: 0.0567 - val_recall: 0.8168
Epoch 20/100
140/140 - 0s - 2ms/step - loss: 0.0520 - recall: 0.8057 - val_loss: 0.0562 - val_recall: 0.8198
Epoch 21/100
140/140 - 0s - 2ms/step - loss: 0.0514 - recall: 0.8172 - val_loss: 0.0556 - val_recall: 0.8228
Epoch 22/100
140/140 - 0s - 3ms/step - loss: 0.0498 - recall: 0.8224 - val_loss: 0.0550 - val_recall: 0.8348
Epoch 23/100
140/140 - 0s - 3ms/step - loss: 0.0494 - recall: 0.8134 - val_loss: 0.0546 - val_recall: 0.8378
Epoch 24/100
140/140 - 1s - 4ms/step - loss: 0.0477 - recall: 0.8250 - val_loss: 0.0540 - val_recall: 0.8408
Epoch 25/100
140/140 - 1s - 5ms/step - loss: 0.0474 - recall: 0.8301 - val_loss: 0.0537 - val_recall: 0.8438
Epoch 26/100
140/140 - 0s - 2ms/step - loss: 0.0475 - recall: 0.8263 - val_loss: 0.0535 - val_recall: 0.8438
Epoch 27/100
140/140 - 0s - 2ms/step - loss: 0.0473 - recall: 0.8378 - val_loss: 0.0535 - val_recall: 0.8468
Epoch 28/100
140/140 - 0s - 2ms/step - loss: 0.0454 - recall: 0.8366 - val_loss: 0.0531 - val_recall: 0.8468
Epoch 29/100
140/140 - 0s - 3ms/step - loss: 0.0448 - recall: 0.8417 - val_loss: 0.0529 - val_recall: 0.8468
Epoch 30/100
140/140 - 0s - 2ms/step - loss: 0.0453 - recall: 0.8327 - val_loss: 0.0527 - val_recall: 0.8498
Epoch 31/100
140/140 - 0s - 2ms/step - loss: 0.0449 - recall: 0.8417 - val_loss: 0.0522 - val_recall: 0.8498
Epoch 32/100
140/140 - 0s - 2ms/step - loss: 0.0445 - recall: 0.8468 - val_loss: 0.0521 - val_recall: 0.8498
Epoch 33/100
140/140 - 0s - 2ms/step - loss: 0.0440 - recall: 0.8456 - val_loss: 0.0519 - val_recall: 0.8529
Epoch 34/100
140/140 - 0s - 2ms/step - loss: 0.0422 - recall: 0.8430 - val_loss: 0.0519 - val_recall: 0.8529
Epoch 35/100
140/140 - 0s - 2ms/step - loss: 0.0429 - recall: 0.8520 - val_loss: 0.0516 - val_recall: 0.8529
Epoch 36/100
140/140 - 0s - 2ms/step - loss: 0.0447 - recall: 0.8456 - val_loss: 0.0515 - val_recall: 0.8529
Epoch 37/100
140/140 - 0s - 2ms/step - loss: 0.0430 - recall: 0.8468 - val_loss: 0.0515 - val_recall: 0.8529
Epoch 38/100
140/140 - 0s - 2ms/step - loss: 0.0414 - recall: 0.8559 - val_loss: 0.0513 - val_recall: 0.8529
Epoch 39/100
140/140 - 0s - 2ms/step - loss: 0.0418 - recall: 0.8520 - val_loss: 0.0510 - val_recall: 0.8529
Epoch 40/100
140/140 - 0s - 2ms/step - loss: 0.0408 - recall: 0.8571 - val_loss: 0.0508 - val_recall: 0.8529
Epoch 41/100
140/140 - 0s - 2ms/step - loss: 0.0415 - recall: 0.8546 - val_loss: 0.0508 - val_recall: 0.8529
Epoch 42/100
140/140 - 0s - 2ms/step - loss: 0.0398 - recall: 0.8649 - val_loss: 0.0505 - val_recall: 0.8529
Epoch 43/100
140/140 - 0s - 2ms/step - loss: 0.0396 - recall: 0.8636 - val_loss: 0.0505 - val_recall: 0.8529
Epoch 44/100
140/140 - 0s - 2ms/step - loss: 0.0405 - recall: 0.8636 - val_loss: 0.0505 - val_recall: 0.8529
Epoch 45/100
140/140 - 0s - 2ms/step - loss: 0.0401 - recall: 0.8597 - val_loss: 0.0504 - val_recall: 0.8529
Epoch 46/100
140/140 - 0s - 2ms/step - loss: 0.0378 - recall: 0.8752 - val_loss: 0.0501 - val_recall: 0.8529
Epoch 47/100
140/140 - 0s - 2ms/step - loss: 0.0385 - recall: 0.8674 - val_loss: 0.0502 - val_recall: 0.8559
Epoch 48/100
140/140 - 0s - 2ms/step - loss: 0.0391 - recall: 0.8662 - val_loss: 0.0500 - val_recall: 0.8559
Epoch 49/100
140/140 - 1s - 5ms/step - loss: 0.0400 - recall: 0.8597 - val_loss: 0.0498 - val_recall: 0.8559
Epoch 50/100
140/140 - 0s - 2ms/step - loss: 0.0390 - recall: 0.8674 - val_loss: 0.0498 - val_recall: 0.8559
Epoch 51/100
140/140 - 0s - 2ms/step - loss: 0.0375 - recall: 0.8752 - val_loss: 0.0496 - val_recall: 0.8589
Epoch 52/100
140/140 - 0s - 2ms/step - loss: 0.0384 - recall: 0.8700 - val_loss: 0.0495 - val_recall: 0.8559
Epoch 53/100
140/140 - 0s - 2ms/step - loss: 0.0375 - recall: 0.8713 - val_loss: 0.0496 - val_recall: 0.8559
Epoch 54/100
140/140 - 1s - 4ms/step - loss: 0.0384 - recall: 0.8649 - val_loss: 0.0496 - val_recall: 0.8589
Epoch 55/100
140/140 - 0s - 2ms/step - loss: 0.0386 - recall: 0.8700 - val_loss: 0.0494 - val_recall: 0.8589
Epoch 56/100
140/140 - 0s - 3ms/step - loss: 0.0374 - recall: 0.8739 - val_loss: 0.0495 - val_recall: 0.8559
Epoch 57/100
140/140 - 1s - 4ms/step - loss: 0.0379 - recall: 0.8752 - val_loss: 0.0493 - val_recall: 0.8559
Epoch 58/100
140/140 - 0s - 3ms/step - loss: 0.0385 - recall: 0.8687 - val_loss: 0.0494 - val_recall: 0.8589
Epoch 59/100
140/140 - 0s - 3ms/step - loss: 0.0372 - recall: 0.8700 - val_loss: 0.0494 - val_recall: 0.8589
Epoch 60/100
140/140 - 0s - 3ms/step - loss: 0.0370 - recall: 0.8726 - val_loss: 0.0492 - val_recall: 0.8589
Epoch 61/100
140/140 - 0s - 2ms/step - loss: 0.0365 - recall: 0.8752 - val_loss: 0.0493 - val_recall: 0.8589
Epoch 62/100
140/140 - 0s - 2ms/step - loss: 0.0358 - recall: 0.8674 - val_loss: 0.0493 - val_recall: 0.8589
Epoch 63/100
140/140 - 0s - 2ms/step - loss: 0.0367 - recall: 0.8790 - val_loss: 0.0492 - val_recall: 0.8589
Epoch 64/100
140/140 - 0s - 2ms/step - loss: 0.0354 - recall: 0.8829 - val_loss: 0.0492 - val_recall: 0.8589
Epoch 65/100
140/140 - 0s - 2ms/step - loss: 0.0370 - recall: 0.8636 - val_loss: 0.0492 - val_recall: 0.8589
Epoch 66/100
140/140 - 0s - 2ms/step - loss: 0.0353 - recall: 0.8739 - val_loss: 0.0491 - val_recall: 0.8589
Epoch 67/100
140/140 - 0s - 2ms/step - loss: 0.0364 - recall: 0.8777 - val_loss: 0.0489 - val_recall: 0.8589
Epoch 68/100
140/140 - 0s - 2ms/step - loss: 0.0356 - recall: 0.8713 - val_loss: 0.0489 - val_recall: 0.8589
Epoch 69/100
140/140 - 0s - 2ms/step - loss: 0.0365 - recall: 0.8739 - val_loss: 0.0488 - val_recall: 0.8589
Epoch 70/100
140/140 - 0s - 2ms/step - loss: 0.0352 - recall: 0.8713 - val_loss: 0.0488 - val_recall: 0.8589
Epoch 71/100
140/140 - 0s - 2ms/step - loss: 0.0351 - recall: 0.8829 - val_loss: 0.0489 - val_recall: 0.8589
Epoch 72/100
140/140 - 0s - 2ms/step - loss: 0.0356 - recall: 0.8764 - val_loss: 0.0491 - val_recall: 0.8589
Epoch 73/100
140/140 - 0s - 2ms/step - loss: 0.0346 - recall: 0.8777 - val_loss: 0.0488 - val_recall: 0.8589
Epoch 74/100
140/140 - 0s - 2ms/step - loss: 0.0349 - recall: 0.8713 - val_loss: 0.0488 - val_recall: 0.8559
Epoch 75/100
140/140 - 0s - 2ms/step - loss: 0.0341 - recall: 0.8880 - val_loss: 0.0488 - val_recall: 0.8589
Epoch 76/100
140/140 - 0s - 2ms/step - loss: 0.0349 - recall: 0.8816 - val_loss: 0.0488 - val_recall: 0.8589
Epoch 77/100
140/140 - 0s - 2ms/step - loss: 0.0353 - recall: 0.8739 - val_loss: 0.0487 - val_recall: 0.8589
Epoch 78/100
140/140 - 0s - 2ms/step - loss: 0.0338 - recall: 0.8855 - val_loss: 0.0489 - val_recall: 0.8619
Epoch 79/100
140/140 - 0s - 2ms/step - loss: 0.0333 - recall: 0.8816 - val_loss: 0.0487 - val_recall: 0.8649
Epoch 80/100
140/140 - 0s - 2ms/step - loss: 0.0340 - recall: 0.8790 - val_loss: 0.0487 - val_recall: 0.8649
Epoch 81/100
140/140 - 0s - 2ms/step - loss: 0.0339 - recall: 0.8829 - val_loss: 0.0488 - val_recall: 0.8649
Epoch 82/100
140/140 - 0s - 2ms/step - loss: 0.0341 - recall: 0.8855 - val_loss: 0.0485 - val_recall: 0.8679
Epoch 83/100
140/140 - 0s - 2ms/step - loss: 0.0331 - recall: 0.8829 - val_loss: 0.0487 - val_recall: 0.8679
Epoch 84/100
140/140 - 0s - 2ms/step - loss: 0.0336 - recall: 0.8867 - val_loss: 0.0485 - val_recall: 0.8649
Epoch 85/100
140/140 - 0s - 2ms/step - loss: 0.0328 - recall: 0.8867 - val_loss: 0.0488 - val_recall: 0.8679
Epoch 86/100
140/140 - 0s - 2ms/step - loss: 0.0340 - recall: 0.8919 - val_loss: 0.0487 - val_recall: 0.8649
Epoch 87/100
140/140 - 0s - 2ms/step - loss: 0.0335 - recall: 0.8855 - val_loss: 0.0488 - val_recall: 0.8709
Epoch 88/100
140/140 - 0s - 2ms/step - loss: 0.0336 - recall: 0.8829 - val_loss: 0.0488 - val_recall: 0.8679
Epoch 89/100
140/140 - 0s - 2ms/step - loss: 0.0334 - recall: 0.8803 - val_loss: 0.0490 - val_recall: 0.8649
Epoch 90/100
140/140 - 0s - 2ms/step - loss: 0.0320 - recall: 0.8893 - val_loss: 0.0488 - val_recall: 0.8679
Epoch 91/100
140/140 - 0s - 2ms/step - loss: 0.0335 - recall: 0.8803 - val_loss: 0.0490 - val_recall: 0.8679
Epoch 92/100
140/140 - 1s - 5ms/step - loss: 0.0319 - recall: 0.8893 - val_loss: 0.0489 - val_recall: 0.8679
Epoch 93/100
140/140 - 1s - 5ms/step - loss: 0.0332 - recall: 0.8893 - val_loss: 0.0490 - val_recall: 0.8679
Epoch 94/100
140/140 - 0s - 3ms/step - loss: 0.0326 - recall: 0.8880 - val_loss: 0.0490 - val_recall: 0.8679
Epoch 95/100
140/140 - 0s - 4ms/step - loss: 0.0338 - recall: 0.8855 - val_loss: 0.0488 - val_recall: 0.8649
Epoch 96/100
140/140 - 0s - 2ms/step - loss: 0.0333 - recall: 0.8842 - val_loss: 0.0491 - val_recall: 0.8679
Epoch 97/100
140/140 - 0s - 2ms/step - loss: 0.0330 - recall: 0.8880 - val_loss: 0.0488 - val_recall: 0.8709
Epoch 98/100
140/140 - 0s - 2ms/step - loss: 0.0322 - recall: 0.8906 - val_loss: 0.0489 - val_recall: 0.8709
Epoch 99/100
140/140 - 0s - 2ms/step - loss: 0.0322 - recall: 0.8867 - val_loss: 0.0491 - val_recall: 0.8679
Epoch 100/100
140/140 - 0s - 2ms/step - loss: 0.0334 - recall: 0.8855 - val_loss: 0.0488 - val_recall: 0.8739
In [92]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [93]:
#plot model's recall
plot(history,'recall')
No description has been provided for this image
  • We see dramatic drops in loss until about the 8th epoch, then we mostly * flatten out after the 20th epoch
  • Validation loss is lower than training loss until about 18th epoch, then it remains higher -- but the gap is slim
  • For recall, we see more oscillation in training than validation
  • Both have dramatic increases until the 18th epoch before slowing down and flattening
  • Validation recall remains higher until about the 38th epoch, when training recall remains higher
In [94]:
#add model to our results df
results.loc[8] = [
  2, #hidden layers
  [64, 128], #neurons/layer
  ["relu", "relu"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "Adam", #optimizer
  [0.0001, "-"], # learning rate, momentum
  ["Xav", "Xav", "Xav"], #weight initializer
  ["Batch norm", "Dropout (0.2)"], #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[94]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
4 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.0001, 0.9] [Xav, Xav, Xav] - 0.085880 0.092282 0.646075 0.672673 15.04
5 2 [64, 128] [relu, relu] 50 100 Adam [0.001, -] [Xav, Xav, Xav] - 0.007018 0.099477 0.963964 0.867868 15.84
6 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] - 0.022984 0.056675 0.909910 0.861862 31.01
7 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] Dropout (0.2) 0.037275 0.052004 0.870013 0.873874 31.34
8 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2)] 0.033385 0.048827 0.885457 0.873874 35.42
In [187]:
model_8_train_perf = model_performance_classification(model8, X_train, y_train)
model_8_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 791us/step
Out[187]:
Accuracy Recall Precision F1 Score
0 0.994071 0.952041 0.990895 0.970564
In [188]:
model_8_val_perf = model_performance_classification(model8, X_val, y_val)
model_8_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 792us/step
Out[188]:
Accuracy Recall Precision F1 Score
0 0.9915 0.936143 0.981316 0.957472

Model 9¶

Plan:

  • Adjusting class weights for imbalanced class distribution
  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • Adam optimizer
  • Reduced learning rate of 1e-4 and increase epochs to 100
  • Dropout of 0.2
  • Batch normalization
In [96]:
#calculate class weights
cw = (y_train.shape[0]) / np.bincount(y_train)

#create a dictionary with class indices and their weights
cw_dict = {}
for i in range(cw.shape[0]):
    cw_dict[i] = cw[i]

cw_dict
Out[96]:
{0: np.float64(1.0587612493382743), 1: np.float64(18.01801801801802)}
In [97]:
#setting the dropout rate
dropout_rate = 0.2
In [98]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model9 = Sequential()

#hidden layer
model9.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#batch normalization
model9.add(BatchNormalization())

#dropout layer
model9.add(Dropout(dropout_rate))

#hidden layer
model9.add(Dense(128, activation = 'relu'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model9.add(Dense(1, activation = 'sigmoid'))
In [99]:
#looking at model details
model9.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization             │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,329 (44.25 KB)
 Trainable params: 11,201 (43.75 KB)
 Non-trainable params: 128 (512.00 B)
In [100]:
#defining optimizer
lr = 1e-4
optimizer = keras.optimizers.Adam(learning_rate = lr)

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model9.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [101]:
#defining batch size and epochs
batch_size = 100
epochs = 100

#fitting model
start = time.time()
history = model9.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, class_weight = cw_dict, verbose = 2)
end=time.time()
Epoch 1/100
140/140 - 1s - 11ms/step - loss: 1.1896 - recall: 0.7439 - val_loss: 0.5574 - val_recall: 0.7988
Epoch 2/100
140/140 - 0s - 2ms/step - loss: 0.8693 - recall: 0.8391 - val_loss: 0.4032 - val_recall: 0.8138
Epoch 3/100
140/140 - 1s - 4ms/step - loss: 0.7422 - recall: 0.8584 - val_loss: 0.3318 - val_recall: 0.8348
Epoch 4/100
140/140 - 0s - 2ms/step - loss: 0.6778 - recall: 0.8662 - val_loss: 0.2931 - val_recall: 0.8529
Epoch 5/100
140/140 - 0s - 2ms/step - loss: 0.6183 - recall: 0.8816 - val_loss: 0.2621 - val_recall: 0.8619
Epoch 6/100
140/140 - 0s - 2ms/step - loss: 0.5824 - recall: 0.8842 - val_loss: 0.2408 - val_recall: 0.8709
Epoch 7/100
140/140 - 0s - 2ms/step - loss: 0.5532 - recall: 0.8932 - val_loss: 0.2252 - val_recall: 0.8649
Epoch 8/100
140/140 - 0s - 2ms/step - loss: 0.5343 - recall: 0.8932 - val_loss: 0.2133 - val_recall: 0.8769
Epoch 9/100
140/140 - 0s - 2ms/step - loss: 0.5062 - recall: 0.9060 - val_loss: 0.2010 - val_recall: 0.8769
Epoch 10/100
140/140 - 0s - 2ms/step - loss: 0.5065 - recall: 0.8906 - val_loss: 0.1928 - val_recall: 0.8769
Epoch 11/100
140/140 - 0s - 2ms/step - loss: 0.4894 - recall: 0.8919 - val_loss: 0.1853 - val_recall: 0.8769
Epoch 12/100
140/140 - 0s - 2ms/step - loss: 0.4680 - recall: 0.9022 - val_loss: 0.1789 - val_recall: 0.8769
Epoch 13/100
140/140 - 0s - 2ms/step - loss: 0.4573 - recall: 0.8970 - val_loss: 0.1732 - val_recall: 0.8769
Epoch 14/100
140/140 - 0s - 2ms/step - loss: 0.4511 - recall: 0.8983 - val_loss: 0.1700 - val_recall: 0.8769
Epoch 15/100
140/140 - 0s - 2ms/step - loss: 0.4334 - recall: 0.9099 - val_loss: 0.1647 - val_recall: 0.8769
Epoch 16/100
140/140 - 0s - 2ms/step - loss: 0.4325 - recall: 0.9060 - val_loss: 0.1622 - val_recall: 0.8769
Epoch 17/100
140/140 - 0s - 2ms/step - loss: 0.4297 - recall: 0.8983 - val_loss: 0.1607 - val_recall: 0.8769
Epoch 18/100
140/140 - 0s - 2ms/step - loss: 0.4222 - recall: 0.9035 - val_loss: 0.1577 - val_recall: 0.8769
Epoch 19/100
140/140 - 0s - 3ms/step - loss: 0.4145 - recall: 0.9009 - val_loss: 0.1564 - val_recall: 0.8769
Epoch 20/100
140/140 - 0s - 3ms/step - loss: 0.4046 - recall: 0.9035 - val_loss: 0.1529 - val_recall: 0.8769
Epoch 21/100
140/140 - 0s - 3ms/step - loss: 0.3983 - recall: 0.9099 - val_loss: 0.1504 - val_recall: 0.8799
Epoch 22/100
140/140 - 0s - 3ms/step - loss: 0.4052 - recall: 0.8996 - val_loss: 0.1481 - val_recall: 0.8739
Epoch 23/100
140/140 - 0s - 3ms/step - loss: 0.3875 - recall: 0.9035 - val_loss: 0.1458 - val_recall: 0.8739
Epoch 24/100
140/140 - 0s - 2ms/step - loss: 0.3828 - recall: 0.9073 - val_loss: 0.1442 - val_recall: 0.8739
Epoch 25/100
140/140 - 0s - 2ms/step - loss: 0.3784 - recall: 0.9112 - val_loss: 0.1429 - val_recall: 0.8739
Epoch 26/100
140/140 - 0s - 2ms/step - loss: 0.3782 - recall: 0.9073 - val_loss: 0.1404 - val_recall: 0.8739
Epoch 27/100
140/140 - 0s - 2ms/step - loss: 0.3695 - recall: 0.9125 - val_loss: 0.1392 - val_recall: 0.8739
Epoch 28/100
140/140 - 0s - 2ms/step - loss: 0.3667 - recall: 0.9048 - val_loss: 0.1396 - val_recall: 0.8739
Epoch 29/100
140/140 - 0s - 2ms/step - loss: 0.3804 - recall: 0.9073 - val_loss: 0.1424 - val_recall: 0.8739
Epoch 30/100
140/140 - 0s - 2ms/step - loss: 0.3713 - recall: 0.9099 - val_loss: 0.1368 - val_recall: 0.8739
Epoch 31/100
140/140 - 0s - 2ms/step - loss: 0.3478 - recall: 0.9176 - val_loss: 0.1334 - val_recall: 0.8739
Epoch 32/100
140/140 - 0s - 2ms/step - loss: 0.3399 - recall: 0.9138 - val_loss: 0.1330 - val_recall: 0.8739
Epoch 33/100
140/140 - 0s - 2ms/step - loss: 0.3499 - recall: 0.9138 - val_loss: 0.1337 - val_recall: 0.8739
Epoch 34/100
140/140 - 0s - 2ms/step - loss: 0.3429 - recall: 0.9176 - val_loss: 0.1296 - val_recall: 0.8739
Epoch 35/100
140/140 - 0s - 2ms/step - loss: 0.3426 - recall: 0.9086 - val_loss: 0.1322 - val_recall: 0.8739
Epoch 36/100
140/140 - 0s - 2ms/step - loss: 0.3493 - recall: 0.9151 - val_loss: 0.1298 - val_recall: 0.8739
Epoch 37/100
140/140 - 0s - 2ms/step - loss: 0.3387 - recall: 0.9125 - val_loss: 0.1290 - val_recall: 0.8739
Epoch 38/100
140/140 - 0s - 2ms/step - loss: 0.3395 - recall: 0.9112 - val_loss: 0.1268 - val_recall: 0.8739
Epoch 39/100
140/140 - 0s - 2ms/step - loss: 0.3221 - recall: 0.9151 - val_loss: 0.1252 - val_recall: 0.8739
Epoch 40/100
140/140 - 0s - 2ms/step - loss: 0.3306 - recall: 0.9112 - val_loss: 0.1251 - val_recall: 0.8769
Epoch 41/100
140/140 - 0s - 2ms/step - loss: 0.3293 - recall: 0.9138 - val_loss: 0.1258 - val_recall: 0.8769
Epoch 42/100
140/140 - 0s - 2ms/step - loss: 0.3274 - recall: 0.9163 - val_loss: 0.1248 - val_recall: 0.8769
Epoch 43/100
140/140 - 0s - 2ms/step - loss: 0.3262 - recall: 0.9228 - val_loss: 0.1247 - val_recall: 0.8769
Epoch 44/100
140/140 - 0s - 2ms/step - loss: 0.3186 - recall: 0.9189 - val_loss: 0.1223 - val_recall: 0.8769
Epoch 45/100
140/140 - 0s - 2ms/step - loss: 0.3182 - recall: 0.9176 - val_loss: 0.1215 - val_recall: 0.8769
Epoch 46/100
140/140 - 0s - 2ms/step - loss: 0.3239 - recall: 0.9125 - val_loss: 0.1242 - val_recall: 0.8769
Epoch 47/100
140/140 - 0s - 2ms/step - loss: 0.3125 - recall: 0.9163 - val_loss: 0.1220 - val_recall: 0.8799
Epoch 48/100
140/140 - 0s - 2ms/step - loss: 0.3143 - recall: 0.9189 - val_loss: 0.1210 - val_recall: 0.8799
Epoch 49/100
140/140 - 0s - 2ms/step - loss: 0.3157 - recall: 0.9086 - val_loss: 0.1216 - val_recall: 0.8829
Epoch 50/100
140/140 - 0s - 2ms/step - loss: 0.2980 - recall: 0.9215 - val_loss: 0.1197 - val_recall: 0.8799
Epoch 51/100
140/140 - 0s - 2ms/step - loss: 0.3027 - recall: 0.9189 - val_loss: 0.1180 - val_recall: 0.8769
Epoch 52/100
140/140 - 0s - 2ms/step - loss: 0.3167 - recall: 0.9086 - val_loss: 0.1230 - val_recall: 0.8829
Epoch 53/100
140/140 - 0s - 2ms/step - loss: 0.3037 - recall: 0.9138 - val_loss: 0.1194 - val_recall: 0.8799
Epoch 54/100
140/140 - 0s - 2ms/step - loss: 0.3011 - recall: 0.9163 - val_loss: 0.1195 - val_recall: 0.8799
Epoch 55/100
140/140 - 0s - 3ms/step - loss: 0.2980 - recall: 0.9138 - val_loss: 0.1200 - val_recall: 0.8829
Epoch 56/100
140/140 - 0s - 3ms/step - loss: 0.2928 - recall: 0.9202 - val_loss: 0.1174 - val_recall: 0.8829
Epoch 57/100
140/140 - 0s - 3ms/step - loss: 0.2979 - recall: 0.9151 - val_loss: 0.1184 - val_recall: 0.8829
Epoch 58/100
140/140 - 0s - 3ms/step - loss: 0.2940 - recall: 0.9215 - val_loss: 0.1156 - val_recall: 0.8829
Epoch 59/100
140/140 - 0s - 4ms/step - loss: 0.2951 - recall: 0.9086 - val_loss: 0.1197 - val_recall: 0.8829
Epoch 60/100
140/140 - 0s - 3ms/step - loss: 0.2716 - recall: 0.9279 - val_loss: 0.1146 - val_recall: 0.8829
Epoch 61/100
140/140 - 0s - 2ms/step - loss: 0.2862 - recall: 0.9151 - val_loss: 0.1154 - val_recall: 0.8829
Epoch 62/100
140/140 - 0s - 2ms/step - loss: 0.2859 - recall: 0.9202 - val_loss: 0.1158 - val_recall: 0.8829
Epoch 63/100
140/140 - 0s - 2ms/step - loss: 0.2694 - recall: 0.9241 - val_loss: 0.1130 - val_recall: 0.8859
Epoch 64/100
140/140 - 0s - 2ms/step - loss: 0.2841 - recall: 0.9202 - val_loss: 0.1142 - val_recall: 0.8829
Epoch 65/100
140/140 - 0s - 2ms/step - loss: 0.2874 - recall: 0.9138 - val_loss: 0.1151 - val_recall: 0.8859
Epoch 66/100
140/140 - 0s - 2ms/step - loss: 0.2864 - recall: 0.9189 - val_loss: 0.1155 - val_recall: 0.8889
Epoch 67/100
140/140 - 0s - 2ms/step - loss: 0.2784 - recall: 0.9176 - val_loss: 0.1131 - val_recall: 0.8859
Epoch 68/100
140/140 - 0s - 2ms/step - loss: 0.2776 - recall: 0.9215 - val_loss: 0.1151 - val_recall: 0.8859
Epoch 69/100
140/140 - 0s - 2ms/step - loss: 0.2709 - recall: 0.9241 - val_loss: 0.1131 - val_recall: 0.8859
Epoch 70/100
140/140 - 0s - 2ms/step - loss: 0.2762 - recall: 0.9215 - val_loss: 0.1136 - val_recall: 0.8859
Epoch 71/100
140/140 - 0s - 2ms/step - loss: 0.2690 - recall: 0.9228 - val_loss: 0.1119 - val_recall: 0.8859
Epoch 72/100
140/140 - 0s - 2ms/step - loss: 0.2681 - recall: 0.9254 - val_loss: 0.1102 - val_recall: 0.8859
Epoch 73/100
140/140 - 0s - 2ms/step - loss: 0.2647 - recall: 0.9189 - val_loss: 0.1104 - val_recall: 0.8859
Epoch 74/100
140/140 - 0s - 2ms/step - loss: 0.2652 - recall: 0.9279 - val_loss: 0.1117 - val_recall: 0.8859
Epoch 75/100
140/140 - 0s - 2ms/step - loss: 0.2615 - recall: 0.9241 - val_loss: 0.1079 - val_recall: 0.8859
Epoch 76/100
140/140 - 0s - 2ms/step - loss: 0.2692 - recall: 0.9228 - val_loss: 0.1098 - val_recall: 0.8859
Epoch 77/100
140/140 - 0s - 2ms/step - loss: 0.2572 - recall: 0.9228 - val_loss: 0.1107 - val_recall: 0.8859
Epoch 78/100
140/140 - 0s - 2ms/step - loss: 0.2599 - recall: 0.9266 - val_loss: 0.1105 - val_recall: 0.8889
Epoch 79/100
140/140 - 0s - 2ms/step - loss: 0.2531 - recall: 0.9292 - val_loss: 0.1065 - val_recall: 0.8889
Epoch 80/100
140/140 - 0s - 2ms/step - loss: 0.2552 - recall: 0.9202 - val_loss: 0.1081 - val_recall: 0.8889
Epoch 81/100
140/140 - 0s - 2ms/step - loss: 0.2655 - recall: 0.9202 - val_loss: 0.1098 - val_recall: 0.8889
Epoch 82/100
140/140 - 0s - 2ms/step - loss: 0.2564 - recall: 0.9305 - val_loss: 0.1077 - val_recall: 0.8889
Epoch 83/100
140/140 - 0s - 2ms/step - loss: 0.2577 - recall: 0.9228 - val_loss: 0.1102 - val_recall: 0.8889
Epoch 84/100
140/140 - 0s - 2ms/step - loss: 0.2482 - recall: 0.9254 - val_loss: 0.1067 - val_recall: 0.8889
Epoch 85/100
140/140 - 0s - 2ms/step - loss: 0.2496 - recall: 0.9305 - val_loss: 0.1058 - val_recall: 0.8889
Epoch 86/100
140/140 - 0s - 2ms/step - loss: 0.2508 - recall: 0.9266 - val_loss: 0.1066 - val_recall: 0.8889
Epoch 87/100
140/140 - 0s - 2ms/step - loss: 0.2441 - recall: 0.9369 - val_loss: 0.1054 - val_recall: 0.8859
Epoch 88/100
140/140 - 0s - 2ms/step - loss: 0.2436 - recall: 0.9279 - val_loss: 0.1050 - val_recall: 0.8889
Epoch 89/100
140/140 - 0s - 2ms/step - loss: 0.2474 - recall: 0.9266 - val_loss: 0.1068 - val_recall: 0.8889
Epoch 90/100
140/140 - 0s - 2ms/step - loss: 0.2606 - recall: 0.9292 - val_loss: 0.1080 - val_recall: 0.8889
Epoch 91/100
140/140 - 0s - 2ms/step - loss: 0.2441 - recall: 0.9292 - val_loss: 0.1071 - val_recall: 0.8889
Epoch 92/100
140/140 - 0s - 3ms/step - loss: 0.2367 - recall: 0.9292 - val_loss: 0.1053 - val_recall: 0.8889
Epoch 93/100
140/140 - 0s - 3ms/step - loss: 0.2523 - recall: 0.9254 - val_loss: 0.1080 - val_recall: 0.8889
Epoch 94/100
140/140 - 1s - 4ms/step - loss: 0.2346 - recall: 0.9318 - val_loss: 0.1057 - val_recall: 0.8889
Epoch 95/100
140/140 - 0s - 3ms/step - loss: 0.2466 - recall: 0.9215 - val_loss: 0.1073 - val_recall: 0.8889
Epoch 96/100
140/140 - 0s - 3ms/step - loss: 0.2500 - recall: 0.9305 - val_loss: 0.1049 - val_recall: 0.8859
Epoch 97/100
140/140 - 0s - 2ms/step - loss: 0.2338 - recall: 0.9305 - val_loss: 0.1050 - val_recall: 0.8889
Epoch 98/100
140/140 - 0s - 2ms/step - loss: 0.2582 - recall: 0.9254 - val_loss: 0.1055 - val_recall: 0.8889
Epoch 99/100
140/140 - 0s - 2ms/step - loss: 0.2380 - recall: 0.9305 - val_loss: 0.1039 - val_recall: 0.8889
Epoch 100/100
140/140 - 0s - 2ms/step - loss: 0.2284 - recall: 0.9356 - val_loss: 0.1028 - val_recall: 0.8889
In [102]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [103]:
#plot model's recall
plot(history, 'recall')
No description has been provided for this image
  • Training and validation loss fall dramatically until about the 10th epoch, then both slow down and plateau -- though the validation plateau is much flatter
  • Validation loss is also lower than training loss For recall, training sets see dramatic increases until about 8 epochs, then training keeps rising while oscillating
  • Meantime, validation recall drops slightly before slight increases with plateaus and oscillations
  • The gap between training and recall is wider
In [104]:
#add model to our results df
results.loc[9] = [
  2, #hidden layers
  [64, 128], #neurons/layer
  ["relu", "relu"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "Adam", #optimizer
  [0.0001, "-"], # learning rate, momentum
  ["Xav", "Xav", "Xav"], #weight initializer
  ["Batch norm", "Dropout (0.2)", "Class weights"], #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[104]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
4 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.0001, 0.9] [Xav, Xav, Xav] - 0.085880 0.092282 0.646075 0.672673 15.04
5 2 [64, 128] [relu, relu] 50 100 Adam [0.001, -] [Xav, Xav, Xav] - 0.007018 0.099477 0.963964 0.867868 15.84
6 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] - 0.022984 0.056675 0.909910 0.861862 31.01
7 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] Dropout (0.2) 0.037275 0.052004 0.870013 0.873874 31.34
8 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2)] 0.033385 0.048827 0.885457 0.873874 35.42
9 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2), Class weights] 0.228398 0.102819 0.935650 0.888889 34.58
In [190]:
model_9_train_perf = model_performance_classification(model9, X_train, y_train)
model_9_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 733us/step
Out[190]:
Accuracy Recall Precision F1 Score
0 0.986929 0.962796 0.920772 0.940634
In [191]:
model_9_val_perf = model_performance_classification(model9, X_val, y_val)
model_9_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 784us/step
Out[191]:
Accuracy Recall Precision F1 Score
0 0.9815 0.937915 0.896714 0.91614

Model 10¶

Plan:

  • Adjusting class weights for imbalanced class distribution
  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • Adam optimizer
  • Reduced learning rate of 1e-4 and increase epochs to 100
  • Dropout of 0.2
  • Batch normalization
  • He initialization
In [106]:
#calculate class weights
cw = (y_train.shape[0]) / np.bincount(y_train)

#create a dictionary with class indices and their weights
cw_dict = {}
for i in range(cw.shape[0]):
    cw_dict[i] = cw[i]

cw_dict
Out[106]:
{0: np.float64(1.0587612493382743), 1: np.float64(18.01801801801802)}
In [107]:
#setting the dropout rate
dropout_rate = 0.2
In [108]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model10 = Sequential()

#hidden layer
model10.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],), kernel_initializer = 'he_normal'))

#batch normalization
model10.add(BatchNormalization())

#dropout layer
model10.add(Dropout(dropout_rate))

#hidden layer
model10.add(Dense(128, activation = 'relu', kernel_initializer = 'he_normal'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model10.add(Dense(1, activation = 'sigmoid')) #leaving default Xavier initialization for the output layer since it's sigmoid
In [109]:
#looking at model details
model10.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization             │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,329 (44.25 KB)
 Trainable params: 11,201 (43.75 KB)
 Non-trainable params: 128 (512.00 B)
In [110]:
#defining optimizer
lr = 1e-4
optimizer = keras.optimizers.Adam(learning_rate = lr)

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model10.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [111]:
#defining batch size and epochs
batch_size = 100
epochs = 100

#fitting model
start = time.time()
history = model10.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, class_weight = cw_dict, verbose = 2)
end=time.time()
Epoch 1/100
140/140 - 1s - 10ms/step - loss: 1.1740 - recall: 0.7490 - val_loss: 0.5815 - val_recall: 0.8048
Epoch 2/100
140/140 - 0s - 3ms/step - loss: 0.9178 - recall: 0.8237 - val_loss: 0.4536 - val_recall: 0.8408
Epoch 3/100
140/140 - 0s - 2ms/step - loss: 0.8096 - recall: 0.8378 - val_loss: 0.3854 - val_recall: 0.8408
Epoch 4/100
140/140 - 0s - 2ms/step - loss: 0.7260 - recall: 0.8571 - val_loss: 0.3466 - val_recall: 0.8498
Epoch 5/100
140/140 - 0s - 2ms/step - loss: 0.6946 - recall: 0.8571 - val_loss: 0.3157 - val_recall: 0.8498
Epoch 6/100
140/140 - 0s - 2ms/step - loss: 0.6504 - recall: 0.8674 - val_loss: 0.2941 - val_recall: 0.8619
Epoch 7/100
140/140 - 0s - 2ms/step - loss: 0.6329 - recall: 0.8597 - val_loss: 0.2770 - val_recall: 0.8619
Epoch 8/100
140/140 - 0s - 2ms/step - loss: 0.6073 - recall: 0.8752 - val_loss: 0.2653 - val_recall: 0.8649
Epoch 9/100
140/140 - 0s - 3ms/step - loss: 0.5907 - recall: 0.8790 - val_loss: 0.2542 - val_recall: 0.8679
Epoch 10/100
140/140 - 0s - 3ms/step - loss: 0.5650 - recall: 0.8880 - val_loss: 0.2365 - val_recall: 0.8679
Epoch 11/100
140/140 - 0s - 3ms/step - loss: 0.5548 - recall: 0.8842 - val_loss: 0.2268 - val_recall: 0.8769
Epoch 12/100
140/140 - 0s - 3ms/step - loss: 0.5287 - recall: 0.8867 - val_loss: 0.2207 - val_recall: 0.8769
Epoch 13/100
140/140 - 0s - 3ms/step - loss: 0.5163 - recall: 0.8893 - val_loss: 0.2128 - val_recall: 0.8799
Epoch 14/100
140/140 - 0s - 2ms/step - loss: 0.5115 - recall: 0.8829 - val_loss: 0.2087 - val_recall: 0.8829
Epoch 15/100
140/140 - 0s - 2ms/step - loss: 0.5105 - recall: 0.8777 - val_loss: 0.2029 - val_recall: 0.8799
Epoch 16/100
140/140 - 0s - 2ms/step - loss: 0.4880 - recall: 0.8919 - val_loss: 0.1971 - val_recall: 0.8769
Epoch 17/100
140/140 - 0s - 2ms/step - loss: 0.4867 - recall: 0.8867 - val_loss: 0.1912 - val_recall: 0.8739
Epoch 18/100
140/140 - 0s - 2ms/step - loss: 0.4859 - recall: 0.8777 - val_loss: 0.1889 - val_recall: 0.8799
Epoch 19/100
140/140 - 0s - 2ms/step - loss: 0.4774 - recall: 0.8919 - val_loss: 0.1849 - val_recall: 0.8769
Epoch 20/100
140/140 - 0s - 2ms/step - loss: 0.4610 - recall: 0.8919 - val_loss: 0.1802 - val_recall: 0.8769
Epoch 21/100
140/140 - 0s - 2ms/step - loss: 0.4479 - recall: 0.8958 - val_loss: 0.1736 - val_recall: 0.8769
Epoch 22/100
140/140 - 0s - 2ms/step - loss: 0.4510 - recall: 0.8983 - val_loss: 0.1723 - val_recall: 0.8769
Epoch 23/100
140/140 - 0s - 2ms/step - loss: 0.4385 - recall: 0.9022 - val_loss: 0.1697 - val_recall: 0.8799
Epoch 24/100
140/140 - 0s - 2ms/step - loss: 0.4432 - recall: 0.8970 - val_loss: 0.1694 - val_recall: 0.8829
Epoch 25/100
140/140 - 0s - 2ms/step - loss: 0.4351 - recall: 0.8958 - val_loss: 0.1653 - val_recall: 0.8829
Epoch 26/100
140/140 - 0s - 2ms/step - loss: 0.4167 - recall: 0.9022 - val_loss: 0.1644 - val_recall: 0.8829
Epoch 27/100
140/140 - 0s - 2ms/step - loss: 0.4187 - recall: 0.9048 - val_loss: 0.1605 - val_recall: 0.8829
Epoch 28/100
140/140 - 0s - 2ms/step - loss: 0.4046 - recall: 0.9099 - val_loss: 0.1562 - val_recall: 0.8769
Epoch 29/100
140/140 - 0s - 2ms/step - loss: 0.4053 - recall: 0.8983 - val_loss: 0.1558 - val_recall: 0.8799
Epoch 30/100
140/140 - 0s - 3ms/step - loss: 0.3956 - recall: 0.9060 - val_loss: 0.1527 - val_recall: 0.8769
Epoch 31/100
140/140 - 0s - 2ms/step - loss: 0.4006 - recall: 0.9009 - val_loss: 0.1521 - val_recall: 0.8799
Epoch 32/100
140/140 - 0s - 2ms/step - loss: 0.4060 - recall: 0.9022 - val_loss: 0.1509 - val_recall: 0.8799
Epoch 33/100
140/140 - 0s - 2ms/step - loss: 0.3848 - recall: 0.9138 - val_loss: 0.1489 - val_recall: 0.8799
Epoch 34/100
140/140 - 0s - 2ms/step - loss: 0.3852 - recall: 0.9060 - val_loss: 0.1461 - val_recall: 0.8769
Epoch 35/100
140/140 - 0s - 2ms/step - loss: 0.3756 - recall: 0.9060 - val_loss: 0.1447 - val_recall: 0.8799
Epoch 36/100
140/140 - 0s - 2ms/step - loss: 0.3805 - recall: 0.9022 - val_loss: 0.1447 - val_recall: 0.8799
Epoch 37/100
140/140 - 0s - 2ms/step - loss: 0.3709 - recall: 0.9112 - val_loss: 0.1419 - val_recall: 0.8799
Epoch 38/100
140/140 - 0s - 2ms/step - loss: 0.3697 - recall: 0.9048 - val_loss: 0.1436 - val_recall: 0.8799
Epoch 39/100
140/140 - 0s - 2ms/step - loss: 0.3657 - recall: 0.9112 - val_loss: 0.1412 - val_recall: 0.8799
Epoch 40/100
140/140 - 0s - 2ms/step - loss: 0.3525 - recall: 0.9125 - val_loss: 0.1374 - val_recall: 0.8799
Epoch 41/100
140/140 - 0s - 2ms/step - loss: 0.3596 - recall: 0.9035 - val_loss: 0.1378 - val_recall: 0.8799
Epoch 42/100
140/140 - 0s - 2ms/step - loss: 0.3497 - recall: 0.9086 - val_loss: 0.1357 - val_recall: 0.8799
Epoch 43/100
140/140 - 0s - 2ms/step - loss: 0.3571 - recall: 0.9073 - val_loss: 0.1383 - val_recall: 0.8799
Epoch 44/100
140/140 - 0s - 2ms/step - loss: 0.3662 - recall: 0.9125 - val_loss: 0.1395 - val_recall: 0.8799
Epoch 45/100
140/140 - 0s - 3ms/step - loss: 0.3604 - recall: 0.9022 - val_loss: 0.1397 - val_recall: 0.8799
Epoch 46/100
140/140 - 0s - 3ms/step - loss: 0.3587 - recall: 0.9176 - val_loss: 0.1354 - val_recall: 0.8769
Epoch 47/100
140/140 - 0s - 3ms/step - loss: 0.3530 - recall: 0.9112 - val_loss: 0.1344 - val_recall: 0.8799
Epoch 48/100
140/140 - 1s - 5ms/step - loss: 0.3400 - recall: 0.9125 - val_loss: 0.1313 - val_recall: 0.8769
Epoch 49/100
140/140 - 0s - 2ms/step - loss: 0.3387 - recall: 0.9202 - val_loss: 0.1318 - val_recall: 0.8769
Epoch 50/100
140/140 - 0s - 2ms/step - loss: 0.3397 - recall: 0.9138 - val_loss: 0.1317 - val_recall: 0.8769
Epoch 51/100
140/140 - 0s - 2ms/step - loss: 0.3446 - recall: 0.9125 - val_loss: 0.1325 - val_recall: 0.8769
Epoch 52/100
140/140 - 0s - 2ms/step - loss: 0.3437 - recall: 0.9138 - val_loss: 0.1321 - val_recall: 0.8769
Epoch 53/100
140/140 - 0s - 2ms/step - loss: 0.3334 - recall: 0.9151 - val_loss: 0.1333 - val_recall: 0.8799
Epoch 54/100
140/140 - 0s - 2ms/step - loss: 0.3314 - recall: 0.9125 - val_loss: 0.1294 - val_recall: 0.8799
Epoch 55/100
140/140 - 0s - 2ms/step - loss: 0.3217 - recall: 0.9151 - val_loss: 0.1281 - val_recall: 0.8799
Epoch 56/100
140/140 - 0s - 2ms/step - loss: 0.3316 - recall: 0.9073 - val_loss: 0.1290 - val_recall: 0.8799
Epoch 57/100
140/140 - 0s - 2ms/step - loss: 0.3248 - recall: 0.9151 - val_loss: 0.1280 - val_recall: 0.8799
Epoch 58/100
140/140 - 0s - 2ms/step - loss: 0.3313 - recall: 0.9138 - val_loss: 0.1290 - val_recall: 0.8799
Epoch 59/100
140/140 - 0s - 2ms/step - loss: 0.3252 - recall: 0.9228 - val_loss: 0.1277 - val_recall: 0.8799
Epoch 60/100
140/140 - 0s - 2ms/step - loss: 0.3350 - recall: 0.9112 - val_loss: 0.1274 - val_recall: 0.8799
Epoch 61/100
140/140 - 0s - 2ms/step - loss: 0.3210 - recall: 0.9151 - val_loss: 0.1276 - val_recall: 0.8799
Epoch 62/100
140/140 - 0s - 2ms/step - loss: 0.3233 - recall: 0.9176 - val_loss: 0.1255 - val_recall: 0.8799
Epoch 63/100
140/140 - 0s - 2ms/step - loss: 0.3232 - recall: 0.9125 - val_loss: 0.1269 - val_recall: 0.8829
Epoch 64/100
140/140 - 0s - 2ms/step - loss: 0.3098 - recall: 0.9151 - val_loss: 0.1260 - val_recall: 0.8829
Epoch 65/100
140/140 - 0s - 2ms/step - loss: 0.3086 - recall: 0.9112 - val_loss: 0.1248 - val_recall: 0.8829
Epoch 66/100
140/140 - 0s - 2ms/step - loss: 0.3081 - recall: 0.9176 - val_loss: 0.1236 - val_recall: 0.8829
Epoch 67/100
140/140 - 0s - 2ms/step - loss: 0.2948 - recall: 0.9215 - val_loss: 0.1216 - val_recall: 0.8829
Epoch 68/100
140/140 - 0s - 2ms/step - loss: 0.2995 - recall: 0.9202 - val_loss: 0.1208 - val_recall: 0.8829
Epoch 69/100
140/140 - 0s - 2ms/step - loss: 0.3081 - recall: 0.9138 - val_loss: 0.1249 - val_recall: 0.8829
Epoch 70/100
140/140 - 0s - 2ms/step - loss: 0.3040 - recall: 0.9202 - val_loss: 0.1255 - val_recall: 0.8829
Epoch 71/100
140/140 - 0s - 2ms/step - loss: 0.3024 - recall: 0.9176 - val_loss: 0.1212 - val_recall: 0.8829
Epoch 72/100
140/140 - 0s - 2ms/step - loss: 0.2999 - recall: 0.9163 - val_loss: 0.1226 - val_recall: 0.8829
Epoch 73/100
140/140 - 0s - 2ms/step - loss: 0.2949 - recall: 0.9215 - val_loss: 0.1215 - val_recall: 0.8829
Epoch 74/100
140/140 - 0s - 2ms/step - loss: 0.3019 - recall: 0.9099 - val_loss: 0.1223 - val_recall: 0.8829
Epoch 75/100
140/140 - 1s - 4ms/step - loss: 0.2997 - recall: 0.9151 - val_loss: 0.1229 - val_recall: 0.8829
Epoch 76/100
140/140 - 0s - 2ms/step - loss: 0.2963 - recall: 0.9215 - val_loss: 0.1207 - val_recall: 0.8829
Epoch 77/100
140/140 - 0s - 2ms/step - loss: 0.2981 - recall: 0.9254 - val_loss: 0.1196 - val_recall: 0.8859
Epoch 78/100
140/140 - 0s - 2ms/step - loss: 0.2965 - recall: 0.9215 - val_loss: 0.1189 - val_recall: 0.8859
Epoch 79/100
140/140 - 0s - 2ms/step - loss: 0.2950 - recall: 0.9266 - val_loss: 0.1188 - val_recall: 0.8859
Epoch 80/100
140/140 - 0s - 2ms/step - loss: 0.2938 - recall: 0.9241 - val_loss: 0.1195 - val_recall: 0.8859
Epoch 81/100
140/140 - 0s - 3ms/step - loss: 0.2899 - recall: 0.9228 - val_loss: 0.1192 - val_recall: 0.8859
Epoch 82/100
140/140 - 0s - 3ms/step - loss: 0.2895 - recall: 0.9266 - val_loss: 0.1184 - val_recall: 0.8859
Epoch 83/100
140/140 - 0s - 3ms/step - loss: 0.2854 - recall: 0.9266 - val_loss: 0.1157 - val_recall: 0.8859
Epoch 84/100
140/140 - 0s - 3ms/step - loss: 0.2785 - recall: 0.9266 - val_loss: 0.1143 - val_recall: 0.8859
Epoch 85/100
140/140 - 0s - 3ms/step - loss: 0.2735 - recall: 0.9266 - val_loss: 0.1131 - val_recall: 0.8859
Epoch 86/100
140/140 - 0s - 2ms/step - loss: 0.2919 - recall: 0.9086 - val_loss: 0.1196 - val_recall: 0.8859
Epoch 87/100
140/140 - 0s - 2ms/step - loss: 0.2727 - recall: 0.9279 - val_loss: 0.1164 - val_recall: 0.8859
Epoch 88/100
140/140 - 0s - 2ms/step - loss: 0.2718 - recall: 0.9228 - val_loss: 0.1144 - val_recall: 0.8859
Epoch 89/100
140/140 - 0s - 3ms/step - loss: 0.2734 - recall: 0.9266 - val_loss: 0.1148 - val_recall: 0.8859
Epoch 90/100
140/140 - 0s - 2ms/step - loss: 0.2773 - recall: 0.9215 - val_loss: 0.1166 - val_recall: 0.8859
Epoch 91/100
140/140 - 0s - 2ms/step - loss: 0.2741 - recall: 0.9254 - val_loss: 0.1143 - val_recall: 0.8859
Epoch 92/100
140/140 - 0s - 2ms/step - loss: 0.2699 - recall: 0.9254 - val_loss: 0.1132 - val_recall: 0.8859
Epoch 93/100
140/140 - 0s - 2ms/step - loss: 0.2861 - recall: 0.9254 - val_loss: 0.1169 - val_recall: 0.8859
Epoch 94/100
140/140 - 0s - 2ms/step - loss: 0.2675 - recall: 0.9292 - val_loss: 0.1142 - val_recall: 0.8859
Epoch 95/100
140/140 - 0s - 2ms/step - loss: 0.2722 - recall: 0.9202 - val_loss: 0.1136 - val_recall: 0.8859
Epoch 96/100
140/140 - 0s - 2ms/step - loss: 0.2805 - recall: 0.9266 - val_loss: 0.1149 - val_recall: 0.8859
Epoch 97/100
140/140 - 0s - 2ms/step - loss: 0.2634 - recall: 0.9254 - val_loss: 0.1142 - val_recall: 0.8859
Epoch 98/100
140/140 - 0s - 2ms/step - loss: 0.2779 - recall: 0.9279 - val_loss: 0.1137 - val_recall: 0.8859
Epoch 99/100
140/140 - 0s - 2ms/step - loss: 0.2694 - recall: 0.9228 - val_loss: 0.1137 - val_recall: 0.8859
Epoch 100/100
140/140 - 0s - 2ms/step - loss: 0.2739 - recall: 0.9215 - val_loss: 0.1126 - val_recall: 0.8859
In [112]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [114]:
#plot model's recall
plot(history, 'recall')
No description has been provided for this image
  • Validation and training loss fall dramatically until about the 10th epoch before slowing and plateauing
  • Training loss oscillates a bit after that
  • Validation loss remains consistently lower than training loss the whole time
  • For recall, both see dramatic increases until the 10th epoch
  • Then, training recall continues rising, with oscillations *Validation recall continues rising less dramatically before mostly plateauing after the 20th epoch
In [115]:
#add model to our results df
results.loc[10] = [
  2, #hidden layers
  [64, 128], #neurons/layer
  ["relu", "relu"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "Adam", #optimizer
  [0.0001, "-"], # learning rate, momentum
  ["He", "He", "Xav"], #weight initializer
  ["Batch norm", "Dropout (0.2), Class weights"], #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[115]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
4 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.0001, 0.9] [Xav, Xav, Xav] - 0.085880 0.092282 0.646075 0.672673 15.04
5 2 [64, 128] [relu, relu] 50 100 Adam [0.001, -] [Xav, Xav, Xav] - 0.007018 0.099477 0.963964 0.867868 15.84
6 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] - 0.022984 0.056675 0.909910 0.861862 31.01
7 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] Dropout (0.2) 0.037275 0.052004 0.870013 0.873874 31.34
8 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2)] 0.033385 0.048827 0.885457 0.873874 35.42
9 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2), Class weights] 0.228398 0.102819 0.935650 0.888889 34.58
10 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [He, He, Xav] [Batch norm, Dropout (0.2), Class weights] 0.273939 0.112600 0.921493 0.885886 34.35
In [192]:
model_10_train_perf = model_performance_classification(model10, X_train, y_train)
model_10_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 766us/step
Out[192]:
Accuracy Recall Precision F1 Score
0 0.985571 0.962683 0.911299 0.935249
In [193]:
model_10_val_perf = model_performance_classification(model10, X_val, y_val)
model_10_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 780us/step
Out[193]:
Accuracy Recall Precision F1 Score
0 0.9825 0.937032 0.904089 0.919816

Model 11¶

Plan:

  • Adjusting class weights for imbalanced class distribution
  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • Adam optimizer
  • Reduced learning rate of 1e-4 and increase epochs to 100
  • L2 regularization
In [116]:
#calculate class weights
cw = (y_train.shape[0]) / np.bincount(y_train)

#create a dictionary with class indices and their weights
cw_dict = {}
for i in range(cw.shape[0]):
    cw_dict[i] = cw[i]

cw_dict
Out[116]:
{0: np.float64(1.0587612493382743), 1: np.float64(18.01801801801802)}
In [117]:
#initialize model

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model11 = Sequential()

#hidden layer
model11.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],), kernel_regularizer=regularizers.l2(0.001)))

#hidden layer
model11.add(Dense(128, activation = 'relu', kernel_regularizer=regularizers.l2(0.001)))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model11.add(Dense(1, activation = 'sigmoid')) #leaving default Xavier initialization for the output layer since it's sigmoid
In [118]:
#looking at model details
model11.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,073 (43.25 KB)
 Trainable params: 11,073 (43.25 KB)
 Non-trainable params: 0 (0.00 B)
In [119]:
#defining optimizer
lr = 1e-4
optimizer = keras.optimizers.Adam(learning_rate = lr)

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model11.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [120]:
#defining batch size and epochs
batch_size = 100
epochs = 100

#fitting model
start = time.time()
history = model11.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, class_weight = cw_dict, verbose = 2)
end=time.time()
Epoch 1/100
140/140 - 1s - 10ms/step - loss: 1.1971 - recall: 0.8571 - val_loss: 0.5610 - val_recall: 0.8559
Epoch 2/100
140/140 - 0s - 2ms/step - loss: 0.8114 - recall: 0.8932 - val_loss: 0.4556 - val_recall: 0.8589
Epoch 3/100
140/140 - 0s - 2ms/step - loss: 0.7157 - recall: 0.9048 - val_loss: 0.4122 - val_recall: 0.8649
Epoch 4/100
140/140 - 0s - 2ms/step - loss: 0.6675 - recall: 0.9099 - val_loss: 0.3877 - val_recall: 0.8649
Epoch 5/100
140/140 - 0s - 2ms/step - loss: 0.6356 - recall: 0.9099 - val_loss: 0.3711 - val_recall: 0.8709
Epoch 6/100
140/140 - 0s - 2ms/step - loss: 0.6109 - recall: 0.9112 - val_loss: 0.3580 - val_recall: 0.8679
Epoch 7/100
140/140 - 0s - 2ms/step - loss: 0.5905 - recall: 0.9086 - val_loss: 0.3476 - val_recall: 0.8619
Epoch 8/100
140/140 - 0s - 2ms/step - loss: 0.5734 - recall: 0.9086 - val_loss: 0.3391 - val_recall: 0.8619
Epoch 9/100
140/140 - 0s - 2ms/step - loss: 0.5587 - recall: 0.9086 - val_loss: 0.3314 - val_recall: 0.8649
Epoch 10/100
140/140 - 0s - 2ms/step - loss: 0.5455 - recall: 0.9112 - val_loss: 0.3248 - val_recall: 0.8649
Epoch 11/100
140/140 - 0s - 2ms/step - loss: 0.5340 - recall: 0.9112 - val_loss: 0.3191 - val_recall: 0.8649
Epoch 12/100
140/140 - 0s - 2ms/step - loss: 0.5234 - recall: 0.9112 - val_loss: 0.3135 - val_recall: 0.8649
Epoch 13/100
140/140 - 0s - 2ms/step - loss: 0.5138 - recall: 0.9125 - val_loss: 0.3088 - val_recall: 0.8649
Epoch 14/100
140/140 - 0s - 2ms/step - loss: 0.5049 - recall: 0.9125 - val_loss: 0.3043 - val_recall: 0.8679
Epoch 15/100
140/140 - 0s - 2ms/step - loss: 0.4966 - recall: 0.9138 - val_loss: 0.3003 - val_recall: 0.8679
Epoch 16/100
140/140 - 0s - 2ms/step - loss: 0.4890 - recall: 0.9138 - val_loss: 0.2967 - val_recall: 0.8679
Epoch 17/100
140/140 - 0s - 2ms/step - loss: 0.4817 - recall: 0.9138 - val_loss: 0.2935 - val_recall: 0.8679
Epoch 18/100
140/140 - 0s - 2ms/step - loss: 0.4751 - recall: 0.9138 - val_loss: 0.2904 - val_recall: 0.8679
Epoch 19/100
140/140 - 0s - 2ms/step - loss: 0.4688 - recall: 0.9163 - val_loss: 0.2876 - val_recall: 0.8709
Epoch 20/100
140/140 - 0s - 2ms/step - loss: 0.4629 - recall: 0.9189 - val_loss: 0.2851 - val_recall: 0.8709
Epoch 21/100
140/140 - 0s - 2ms/step - loss: 0.4572 - recall: 0.9189 - val_loss: 0.2822 - val_recall: 0.8709
Epoch 22/100
140/140 - 0s - 2ms/step - loss: 0.4517 - recall: 0.9202 - val_loss: 0.2801 - val_recall: 0.8709
Epoch 23/100
140/140 - 0s - 2ms/step - loss: 0.4465 - recall: 0.9202 - val_loss: 0.2779 - val_recall: 0.8709
Epoch 24/100
140/140 - 0s - 2ms/step - loss: 0.4416 - recall: 0.9202 - val_loss: 0.2758 - val_recall: 0.8709
Epoch 25/100
140/140 - 0s - 2ms/step - loss: 0.4369 - recall: 0.9215 - val_loss: 0.2738 - val_recall: 0.8709
Epoch 26/100
140/140 - 0s - 2ms/step - loss: 0.4325 - recall: 0.9228 - val_loss: 0.2716 - val_recall: 0.8709
Epoch 27/100
140/140 - 0s - 2ms/step - loss: 0.4282 - recall: 0.9228 - val_loss: 0.2700 - val_recall: 0.8739
Epoch 28/100
140/140 - 0s - 2ms/step - loss: 0.4240 - recall: 0.9228 - val_loss: 0.2683 - val_recall: 0.8799
Epoch 29/100
140/140 - 0s - 2ms/step - loss: 0.4200 - recall: 0.9228 - val_loss: 0.2666 - val_recall: 0.8799
Epoch 30/100
140/140 - 0s - 2ms/step - loss: 0.4162 - recall: 0.9228 - val_loss: 0.2654 - val_recall: 0.8799
Epoch 31/100
140/140 - 0s - 2ms/step - loss: 0.4124 - recall: 0.9228 - val_loss: 0.2635 - val_recall: 0.8799
Epoch 32/100
140/140 - 0s - 2ms/step - loss: 0.4088 - recall: 0.9228 - val_loss: 0.2620 - val_recall: 0.8799
Epoch 33/100
140/140 - 0s - 3ms/step - loss: 0.4052 - recall: 0.9228 - val_loss: 0.2609 - val_recall: 0.8799
Epoch 34/100
140/140 - 0s - 3ms/step - loss: 0.4016 - recall: 0.9228 - val_loss: 0.2593 - val_recall: 0.8799
Epoch 35/100
140/140 - 0s - 3ms/step - loss: 0.3981 - recall: 0.9254 - val_loss: 0.2581 - val_recall: 0.8799
Epoch 36/100
140/140 - 1s - 5ms/step - loss: 0.3950 - recall: 0.9266 - val_loss: 0.2568 - val_recall: 0.8799
Epoch 37/100
140/140 - 0s - 2ms/step - loss: 0.3918 - recall: 0.9254 - val_loss: 0.2554 - val_recall: 0.8799
Epoch 38/100
140/140 - 0s - 2ms/step - loss: 0.3887 - recall: 0.9254 - val_loss: 0.2545 - val_recall: 0.8799
Epoch 39/100
140/140 - 0s - 2ms/step - loss: 0.3858 - recall: 0.9266 - val_loss: 0.2533 - val_recall: 0.8799
Epoch 40/100
140/140 - 0s - 2ms/step - loss: 0.3828 - recall: 0.9266 - val_loss: 0.2524 - val_recall: 0.8799
Epoch 41/100
140/140 - 0s - 2ms/step - loss: 0.3799 - recall: 0.9266 - val_loss: 0.2510 - val_recall: 0.8799
Epoch 42/100
140/140 - 0s - 2ms/step - loss: 0.3772 - recall: 0.9266 - val_loss: 0.2502 - val_recall: 0.8799
Epoch 43/100
140/140 - 0s - 2ms/step - loss: 0.3744 - recall: 0.9266 - val_loss: 0.2491 - val_recall: 0.8799
Epoch 44/100
140/140 - 0s - 2ms/step - loss: 0.3717 - recall: 0.9266 - val_loss: 0.2481 - val_recall: 0.8799
Epoch 45/100
140/140 - 0s - 2ms/step - loss: 0.3691 - recall: 0.9266 - val_loss: 0.2470 - val_recall: 0.8799
Epoch 46/100
140/140 - 0s - 2ms/step - loss: 0.3666 - recall: 0.9266 - val_loss: 0.2463 - val_recall: 0.8799
Epoch 47/100
140/140 - 0s - 2ms/step - loss: 0.3641 - recall: 0.9266 - val_loss: 0.2453 - val_recall: 0.8799
Epoch 48/100
140/140 - 0s - 2ms/step - loss: 0.3616 - recall: 0.9279 - val_loss: 0.2442 - val_recall: 0.8799
Epoch 49/100
140/140 - 0s - 2ms/step - loss: 0.3592 - recall: 0.9279 - val_loss: 0.2432 - val_recall: 0.8799
Epoch 50/100
140/140 - 0s - 2ms/step - loss: 0.3568 - recall: 0.9292 - val_loss: 0.2425 - val_recall: 0.8799
Epoch 51/100
140/140 - 0s - 2ms/step - loss: 0.3545 - recall: 0.9305 - val_loss: 0.2415 - val_recall: 0.8799
Epoch 52/100
140/140 - 0s - 2ms/step - loss: 0.3522 - recall: 0.9305 - val_loss: 0.2406 - val_recall: 0.8769
Epoch 53/100
140/140 - 0s - 2ms/step - loss: 0.3499 - recall: 0.9318 - val_loss: 0.2397 - val_recall: 0.8769
Epoch 54/100
140/140 - 0s - 2ms/step - loss: 0.3478 - recall: 0.9318 - val_loss: 0.2391 - val_recall: 0.8799
Epoch 55/100
140/140 - 0s - 2ms/step - loss: 0.3457 - recall: 0.9318 - val_loss: 0.2381 - val_recall: 0.8829
Epoch 56/100
140/140 - 0s - 2ms/step - loss: 0.3434 - recall: 0.9318 - val_loss: 0.2372 - val_recall: 0.8829
Epoch 57/100
140/140 - 0s - 2ms/step - loss: 0.3414 - recall: 0.9318 - val_loss: 0.2367 - val_recall: 0.8829
Epoch 58/100
140/140 - 0s - 2ms/step - loss: 0.3394 - recall: 0.9344 - val_loss: 0.2359 - val_recall: 0.8829
Epoch 59/100
140/140 - 0s - 2ms/step - loss: 0.3372 - recall: 0.9344 - val_loss: 0.2349 - val_recall: 0.8829
Epoch 60/100
140/140 - 0s - 2ms/step - loss: 0.3353 - recall: 0.9344 - val_loss: 0.2343 - val_recall: 0.8829
Epoch 61/100
140/140 - 0s - 2ms/step - loss: 0.3334 - recall: 0.9356 - val_loss: 0.2334 - val_recall: 0.8829
Epoch 62/100
140/140 - 0s - 2ms/step - loss: 0.3314 - recall: 0.9356 - val_loss: 0.2325 - val_recall: 0.8829
Epoch 63/100
140/140 - 0s - 2ms/step - loss: 0.3295 - recall: 0.9369 - val_loss: 0.2319 - val_recall: 0.8829
Epoch 64/100
140/140 - 0s - 2ms/step - loss: 0.3276 - recall: 0.9369 - val_loss: 0.2312 - val_recall: 0.8829
Epoch 65/100
140/140 - 0s - 2ms/step - loss: 0.3257 - recall: 0.9382 - val_loss: 0.2304 - val_recall: 0.8829
Epoch 66/100
140/140 - 0s - 2ms/step - loss: 0.3238 - recall: 0.9382 - val_loss: 0.2296 - val_recall: 0.8829
Epoch 67/100
140/140 - 0s - 2ms/step - loss: 0.3221 - recall: 0.9395 - val_loss: 0.2290 - val_recall: 0.8829
Epoch 68/100
140/140 - 0s - 2ms/step - loss: 0.3203 - recall: 0.9395 - val_loss: 0.2281 - val_recall: 0.8829
Epoch 69/100
140/140 - 0s - 2ms/step - loss: 0.3186 - recall: 0.9395 - val_loss: 0.2278 - val_recall: 0.8829
Epoch 70/100
140/140 - 0s - 2ms/step - loss: 0.3168 - recall: 0.9395 - val_loss: 0.2267 - val_recall: 0.8829
Epoch 71/100
140/140 - 0s - 2ms/step - loss: 0.3152 - recall: 0.9395 - val_loss: 0.2265 - val_recall: 0.8829
Epoch 72/100
140/140 - 0s - 3ms/step - loss: 0.3135 - recall: 0.9395 - val_loss: 0.2259 - val_recall: 0.8829
Epoch 73/100
140/140 - 1s - 4ms/step - loss: 0.3118 - recall: 0.9395 - val_loss: 0.2252 - val_recall: 0.8829
Epoch 74/100
140/140 - 0s - 3ms/step - loss: 0.3102 - recall: 0.9395 - val_loss: 0.2244 - val_recall: 0.8829
Epoch 75/100
140/140 - 1s - 4ms/step - loss: 0.3085 - recall: 0.9408 - val_loss: 0.2238 - val_recall: 0.8829
Epoch 76/100
140/140 - 1s - 4ms/step - loss: 0.3070 - recall: 0.9408 - val_loss: 0.2233 - val_recall: 0.8829
Epoch 77/100
140/140 - 0s - 2ms/step - loss: 0.3053 - recall: 0.9408 - val_loss: 0.2226 - val_recall: 0.8859
Epoch 78/100
140/140 - 0s - 2ms/step - loss: 0.3037 - recall: 0.9421 - val_loss: 0.2216 - val_recall: 0.8829
Epoch 79/100
140/140 - 0s - 2ms/step - loss: 0.3022 - recall: 0.9421 - val_loss: 0.2218 - val_recall: 0.8829
Epoch 80/100
140/140 - 0s - 2ms/step - loss: 0.3008 - recall: 0.9421 - val_loss: 0.2207 - val_recall: 0.8829
Epoch 81/100
140/140 - 0s - 2ms/step - loss: 0.2992 - recall: 0.9434 - val_loss: 0.2200 - val_recall: 0.8829
Epoch 82/100
140/140 - 0s - 2ms/step - loss: 0.2977 - recall: 0.9434 - val_loss: 0.2200 - val_recall: 0.8829
Epoch 83/100
140/140 - 0s - 2ms/step - loss: 0.2961 - recall: 0.9434 - val_loss: 0.2189 - val_recall: 0.8829
Epoch 84/100
140/140 - 0s - 2ms/step - loss: 0.2947 - recall: 0.9434 - val_loss: 0.2184 - val_recall: 0.8829
Epoch 85/100
140/140 - 0s - 2ms/step - loss: 0.2931 - recall: 0.9434 - val_loss: 0.2178 - val_recall: 0.8859
Epoch 86/100
140/140 - 0s - 2ms/step - loss: 0.2918 - recall: 0.9434 - val_loss: 0.2175 - val_recall: 0.8859
Epoch 87/100
140/140 - 0s - 2ms/step - loss: 0.2902 - recall: 0.9434 - val_loss: 0.2170 - val_recall: 0.8859
Epoch 88/100
140/140 - 0s - 2ms/step - loss: 0.2889 - recall: 0.9434 - val_loss: 0.2163 - val_recall: 0.8859
Epoch 89/100
140/140 - 0s - 2ms/step - loss: 0.2873 - recall: 0.9434 - val_loss: 0.2158 - val_recall: 0.8859
Epoch 90/100
140/140 - 0s - 2ms/step - loss: 0.2861 - recall: 0.9434 - val_loss: 0.2154 - val_recall: 0.8859
Epoch 91/100
140/140 - 0s - 2ms/step - loss: 0.2846 - recall: 0.9434 - val_loss: 0.2146 - val_recall: 0.8859
Epoch 92/100
140/140 - 0s - 2ms/step - loss: 0.2832 - recall: 0.9447 - val_loss: 0.2141 - val_recall: 0.8859
Epoch 93/100
140/140 - 0s - 2ms/step - loss: 0.2819 - recall: 0.9459 - val_loss: 0.2136 - val_recall: 0.8859
Epoch 94/100
140/140 - 0s - 2ms/step - loss: 0.2805 - recall: 0.9459 - val_loss: 0.2130 - val_recall: 0.8859
Epoch 95/100
140/140 - 0s - 2ms/step - loss: 0.2791 - recall: 0.9447 - val_loss: 0.2125 - val_recall: 0.8859
Epoch 96/100
140/140 - 0s - 2ms/step - loss: 0.2780 - recall: 0.9459 - val_loss: 0.2121 - val_recall: 0.8859
Epoch 97/100
140/140 - 0s - 2ms/step - loss: 0.2765 - recall: 0.9472 - val_loss: 0.2115 - val_recall: 0.8859
Epoch 98/100
140/140 - 0s - 2ms/step - loss: 0.2753 - recall: 0.9472 - val_loss: 0.2109 - val_recall: 0.8859
Epoch 99/100
140/140 - 0s - 2ms/step - loss: 0.2741 - recall: 0.9485 - val_loss: 0.2107 - val_recall: 0.8859
Epoch 100/100
140/140 - 0s - 2ms/step - loss: 0.2729 - recall: 0.9485 - val_loss: 0.2100 - val_recall: 0.8859
In [121]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [122]:
#plot model's recall
plot(history, 'recall')
No description has been provided for this image
  • Validation and training loss fall dramatically until about the 9th epoch, then slow down
  • Validation loss is consistently lower than training loss
  • For recall, validation increases greatly until around the 7th epoch, falls, then rises gradually before a flattening trend after the 30th epoch
  • Meantime, training recall increases dramatically until about the 7th epoch, then continues rising through the end
  • There is a wide gap between the two
In [123]:
#add model to our results df
results.loc[11] = [
  2, #hidden layers
  [64, 128], #neurons/layer
  ["relu", "relu"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "Adam", #optimizer
  [0.0001, "-"], # learning rate, momentum
  ["Xav", "Xav", "Xav"], #weight initializer
  ["L2", "L2", "Class weights"], #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[123]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
4 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.0001, 0.9] [Xav, Xav, Xav] - 0.085880 0.092282 0.646075 0.672673 15.04
5 2 [64, 128] [relu, relu] 50 100 Adam [0.001, -] [Xav, Xav, Xav] - 0.007018 0.099477 0.963964 0.867868 15.84
6 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] - 0.022984 0.056675 0.909910 0.861862 31.01
7 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] Dropout (0.2) 0.037275 0.052004 0.870013 0.873874 31.34
8 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2)] 0.033385 0.048827 0.885457 0.873874 35.42
9 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2), Class weights] 0.228398 0.102819 0.935650 0.888889 34.58
10 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [He, He, Xav] [Batch norm, Dropout (0.2), Class weights] 0.273939 0.112600 0.921493 0.885886 34.35
11 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [L2, L2, Class weights] 0.272880 0.210015 0.948520 0.885886 31.58
In [194]:
model_11_train_perf = model_performance_classification(model11, X_train, y_train)
model_11_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 753us/step
Out[194]:
Accuracy Recall Precision F1 Score
0 0.988286 0.973811 0.923886 0.947251
In [195]:
model_11_val_perf = model_performance_classification(model11, X_val, y_val)
model_11_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 781us/step
Out[195]:
Accuracy Recall Precision F1 Score
0 0.982333 0.936943 0.902966 0.919162

Model 12¶

Plan:

  • Early stopping to find optimal version of the best model
  • Adjusting class weights for imbalanced class distribution
  • Two hidden layers -- 64, 128
  • activation function -- relu, relu
  • Adam optimizer *Reduced learning rate of 1e-4 and increase epochs to 800 for early stopping
  • Dropout of 0.2
  • Batch normalization
In [125]:
#calculate class weights
cw = (y_train.shape[0]) / np.bincount(y_train)

#create a dictionary with class indices and their weights
cw_dict = {}
for i in range(cw.shape[0]):
    cw_dict[i] = cw[i]

cw_dict
Out[125]:
{0: np.float64(1.0587612493382743), 1: np.float64(18.01801801801802)}
In [126]:
#setting the dropout rate
dropout_rate = 0.2
In [127]:
#build Model 9 again

#introduce early stopping callback -- so we get the best version of our best model
es= keras.callbacks.EarlyStopping(monitor='val_loss', min_delta=0, patience=15, verbose=1, mode='min', restore_best_weights= True)

#clear keras session
tf.keras.backend.clear_session()

#initialize neural network
model12 = Sequential()

#hidden layer
model12.add(Dense(64, activation = 'relu', input_shape = (X_train.shape[1],)))

#batch normalization
model12.add(BatchNormalization())

#dropout layer
model12.add(Dropout(dropout_rate))

#hidden layer
model12.add(Dense(128, activation = 'relu'))

#output layer -- one node for binary target variable, and sigmoid for a binary classification problem
model12.add(Dense(1, activation = 'sigmoid'))
In [128]:
#looking at model details
model12.summary()
Model: "sequential"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━┓
┃ Layer (type)                    ┃ Output Shape           ┃       Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━┩
│ dense (Dense)                   │ (None, 64)             │         2,624 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ batch_normalization             │ (None, 64)             │           256 │
│ (BatchNormalization)            │                        │               │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dropout (Dropout)               │ (None, 64)             │             0 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_1 (Dense)                 │ (None, 128)            │         8,320 │
├─────────────────────────────────┼────────────────────────┼───────────────┤
│ dense_2 (Dense)                 │ (None, 1)              │           129 │
└─────────────────────────────────┴────────────────────────┴───────────────┘
 Total params: 11,329 (44.25 KB)
 Trainable params: 11,201 (43.75 KB)
 Non-trainable params: 128 (512.00 B)
In [129]:
#defining optimizer
lr = 1e-4
optimizer = keras.optimizers.Adam(learning_rate = lr)

#compile model, set optimizer and loss function -- binary crossentropy for binary target variable, with recall as metric
model12.compile(optimizer = optimizer, loss = 'binary_crossentropy', metrics = ['recall'])
In [131]:
#defining batch size and epochs
batch_size = 100
epochs = 800 #high count combined with early stopping will help find best version of model

#fitting model
start = time.time()
history = model12.fit(X_train, y_train, validation_data = (X_val, y_val), batch_size = batch_size, epochs = epochs, callbacks = [es], verbose = 2)
end=time.time()
Epoch 1/800
140/140 - 2s - 11ms/step - loss: 0.3352 - recall: 0.2407 - val_loss: 0.2133 - val_recall: 0.3574
Epoch 2/800
140/140 - 0s - 2ms/step - loss: 0.1789 - recall: 0.2831 - val_loss: 0.1420 - val_recall: 0.4054
Epoch 3/800
140/140 - 0s - 2ms/step - loss: 0.1352 - recall: 0.3861 - val_loss: 0.1141 - val_recall: 0.4775
Epoch 4/800
140/140 - 0s - 3ms/step - loss: 0.1142 - recall: 0.4582 - val_loss: 0.0992 - val_recall: 0.5646
Epoch 5/800
140/140 - 0s - 3ms/step - loss: 0.1010 - recall: 0.5341 - val_loss: 0.0894 - val_recall: 0.6186
Epoch 6/800
140/140 - 0s - 3ms/step - loss: 0.0946 - recall: 0.5598 - val_loss: 0.0826 - val_recall: 0.6667
Epoch 7/800
140/140 - 1s - 4ms/step - loss: 0.0854 - recall: 0.6088 - val_loss: 0.0778 - val_recall: 0.7057
Epoch 8/800
140/140 - 0s - 3ms/step - loss: 0.0808 - recall: 0.6499 - val_loss: 0.0740 - val_recall: 0.7447
Epoch 9/800
140/140 - 0s - 2ms/step - loss: 0.0761 - recall: 0.6641 - val_loss: 0.0706 - val_recall: 0.7568
Epoch 10/800
140/140 - 0s - 2ms/step - loss: 0.0734 - recall: 0.6873 - val_loss: 0.0683 - val_recall: 0.7598
Epoch 11/800
140/140 - 0s - 2ms/step - loss: 0.0690 - recall: 0.7156 - val_loss: 0.0661 - val_recall: 0.7718
Epoch 12/800
140/140 - 0s - 2ms/step - loss: 0.0668 - recall: 0.7143 - val_loss: 0.0644 - val_recall: 0.7838
Epoch 13/800
140/140 - 0s - 2ms/step - loss: 0.0643 - recall: 0.7439 - val_loss: 0.0627 - val_recall: 0.7838
Epoch 14/800
140/140 - 0s - 2ms/step - loss: 0.0633 - recall: 0.7465 - val_loss: 0.0615 - val_recall: 0.7928
Epoch 15/800
140/140 - 0s - 2ms/step - loss: 0.0599 - recall: 0.7632 - val_loss: 0.0606 - val_recall: 0.8048
Epoch 16/800
140/140 - 0s - 2ms/step - loss: 0.0597 - recall: 0.7722 - val_loss: 0.0598 - val_recall: 0.8048
Epoch 17/800
140/140 - 0s - 2ms/step - loss: 0.0595 - recall: 0.7748 - val_loss: 0.0590 - val_recall: 0.8078
Epoch 18/800
140/140 - 0s - 2ms/step - loss: 0.0581 - recall: 0.7748 - val_loss: 0.0585 - val_recall: 0.8138
Epoch 19/800
140/140 - 0s - 2ms/step - loss: 0.0547 - recall: 0.8005 - val_loss: 0.0579 - val_recall: 0.8198
Epoch 20/800
140/140 - 0s - 2ms/step - loss: 0.0540 - recall: 0.8005 - val_loss: 0.0573 - val_recall: 0.8258
Epoch 21/800
140/140 - 0s - 2ms/step - loss: 0.0527 - recall: 0.8082 - val_loss: 0.0568 - val_recall: 0.8378
Epoch 22/800
140/140 - 0s - 2ms/step - loss: 0.0528 - recall: 0.8069 - val_loss: 0.0565 - val_recall: 0.8348
Epoch 23/800
140/140 - 0s - 2ms/step - loss: 0.0532 - recall: 0.8044 - val_loss: 0.0562 - val_recall: 0.8378
Epoch 24/800
140/140 - 0s - 2ms/step - loss: 0.0519 - recall: 0.8108 - val_loss: 0.0557 - val_recall: 0.8348
Epoch 25/800
140/140 - 0s - 2ms/step - loss: 0.0504 - recall: 0.8211 - val_loss: 0.0555 - val_recall: 0.8378
Epoch 26/800
140/140 - 0s - 2ms/step - loss: 0.0493 - recall: 0.8301 - val_loss: 0.0552 - val_recall: 0.8378
Epoch 27/800
140/140 - 0s - 2ms/step - loss: 0.0498 - recall: 0.8211 - val_loss: 0.0549 - val_recall: 0.8408
Epoch 28/800
140/140 - 0s - 2ms/step - loss: 0.0495 - recall: 0.8134 - val_loss: 0.0546 - val_recall: 0.8438
Epoch 29/800
140/140 - 0s - 2ms/step - loss: 0.0488 - recall: 0.8224 - val_loss: 0.0544 - val_recall: 0.8498
Epoch 30/800
140/140 - 0s - 2ms/step - loss: 0.0488 - recall: 0.8250 - val_loss: 0.0539 - val_recall: 0.8438
Epoch 31/800
140/140 - 0s - 2ms/step - loss: 0.0478 - recall: 0.8237 - val_loss: 0.0538 - val_recall: 0.8468
Epoch 32/800
140/140 - 0s - 2ms/step - loss: 0.0471 - recall: 0.8211 - val_loss: 0.0536 - val_recall: 0.8438
Epoch 33/800
140/140 - 0s - 2ms/step - loss: 0.0470 - recall: 0.8263 - val_loss: 0.0534 - val_recall: 0.8468
Epoch 34/800
140/140 - 0s - 2ms/step - loss: 0.0455 - recall: 0.8366 - val_loss: 0.0532 - val_recall: 0.8468
Epoch 35/800
140/140 - 0s - 2ms/step - loss: 0.0455 - recall: 0.8443 - val_loss: 0.0531 - val_recall: 0.8468
Epoch 36/800
140/140 - 0s - 2ms/step - loss: 0.0456 - recall: 0.8443 - val_loss: 0.0529 - val_recall: 0.8498
Epoch 37/800
140/140 - 0s - 2ms/step - loss: 0.0448 - recall: 0.8430 - val_loss: 0.0528 - val_recall: 0.8498
Epoch 38/800
140/140 - 0s - 2ms/step - loss: 0.0446 - recall: 0.8391 - val_loss: 0.0526 - val_recall: 0.8498
Epoch 39/800
140/140 - 0s - 3ms/step - loss: 0.0438 - recall: 0.8494 - val_loss: 0.0524 - val_recall: 0.8498
Epoch 40/800
140/140 - 1s - 4ms/step - loss: 0.0432 - recall: 0.8520 - val_loss: 0.0522 - val_recall: 0.8498
Epoch 41/800
140/140 - 1s - 5ms/step - loss: 0.0430 - recall: 0.8456 - val_loss: 0.0521 - val_recall: 0.8529
Epoch 42/800
140/140 - 0s - 3ms/step - loss: 0.0425 - recall: 0.8571 - val_loss: 0.0522 - val_recall: 0.8498
Epoch 43/800
140/140 - 0s - 2ms/step - loss: 0.0425 - recall: 0.8507 - val_loss: 0.0521 - val_recall: 0.8498
Epoch 44/800
140/140 - 0s - 2ms/step - loss: 0.0416 - recall: 0.8559 - val_loss: 0.0518 - val_recall: 0.8498
Epoch 45/800
140/140 - 0s - 2ms/step - loss: 0.0425 - recall: 0.8481 - val_loss: 0.0519 - val_recall: 0.8498
Epoch 46/800
140/140 - 0s - 2ms/step - loss: 0.0420 - recall: 0.8533 - val_loss: 0.0516 - val_recall: 0.8498
Epoch 47/800
140/140 - 0s - 2ms/step - loss: 0.0414 - recall: 0.8546 - val_loss: 0.0515 - val_recall: 0.8529
Epoch 48/800
140/140 - 0s - 2ms/step - loss: 0.0397 - recall: 0.8700 - val_loss: 0.0514 - val_recall: 0.8498
Epoch 49/800
140/140 - 0s - 2ms/step - loss: 0.0417 - recall: 0.8507 - val_loss: 0.0513 - val_recall: 0.8498
Epoch 50/800
140/140 - 0s - 2ms/step - loss: 0.0404 - recall: 0.8584 - val_loss: 0.0513 - val_recall: 0.8529
Epoch 51/800
140/140 - 0s - 2ms/step - loss: 0.0400 - recall: 0.8584 - val_loss: 0.0512 - val_recall: 0.8529
Epoch 52/800
140/140 - 0s - 2ms/step - loss: 0.0395 - recall: 0.8649 - val_loss: 0.0510 - val_recall: 0.8529
Epoch 53/800
140/140 - 0s - 3ms/step - loss: 0.0380 - recall: 0.8674 - val_loss: 0.0511 - val_recall: 0.8529
Epoch 54/800
140/140 - 0s - 2ms/step - loss: 0.0390 - recall: 0.8662 - val_loss: 0.0511 - val_recall: 0.8529
Epoch 55/800
140/140 - 0s - 2ms/step - loss: 0.0380 - recall: 0.8726 - val_loss: 0.0509 - val_recall: 0.8529
Epoch 56/800
140/140 - 0s - 3ms/step - loss: 0.0376 - recall: 0.8687 - val_loss: 0.0508 - val_recall: 0.8559
Epoch 57/800
140/140 - 0s - 2ms/step - loss: 0.0394 - recall: 0.8649 - val_loss: 0.0508 - val_recall: 0.8559
Epoch 58/800
140/140 - 0s - 2ms/step - loss: 0.0391 - recall: 0.8687 - val_loss: 0.0509 - val_recall: 0.8529
Epoch 59/800
140/140 - 0s - 2ms/step - loss: 0.0382 - recall: 0.8584 - val_loss: 0.0507 - val_recall: 0.8498
Epoch 60/800
140/140 - 0s - 2ms/step - loss: 0.0386 - recall: 0.8713 - val_loss: 0.0506 - val_recall: 0.8529
Epoch 61/800
140/140 - 0s - 2ms/step - loss: 0.0386 - recall: 0.8726 - val_loss: 0.0505 - val_recall: 0.8559
Epoch 62/800
140/140 - 0s - 2ms/step - loss: 0.0384 - recall: 0.8649 - val_loss: 0.0504 - val_recall: 0.8559
Epoch 63/800
140/140 - 0s - 2ms/step - loss: 0.0376 - recall: 0.8726 - val_loss: 0.0502 - val_recall: 0.8529
Epoch 64/800
140/140 - 0s - 2ms/step - loss: 0.0378 - recall: 0.8674 - val_loss: 0.0503 - val_recall: 0.8589
Epoch 65/800
140/140 - 0s - 2ms/step - loss: 0.0372 - recall: 0.8764 - val_loss: 0.0504 - val_recall: 0.8589
Epoch 66/800
140/140 - 0s - 2ms/step - loss: 0.0372 - recall: 0.8700 - val_loss: 0.0503 - val_recall: 0.8589
Epoch 67/800
140/140 - 0s - 2ms/step - loss: 0.0369 - recall: 0.8764 - val_loss: 0.0502 - val_recall: 0.8589
Epoch 68/800
140/140 - 0s - 2ms/step - loss: 0.0365 - recall: 0.8726 - val_loss: 0.0501 - val_recall: 0.8589
Epoch 69/800
140/140 - 0s - 2ms/step - loss: 0.0360 - recall: 0.8752 - val_loss: 0.0502 - val_recall: 0.8589
Epoch 70/800
140/140 - 0s - 2ms/step - loss: 0.0361 - recall: 0.8726 - val_loss: 0.0501 - val_recall: 0.8619
Epoch 71/800
140/140 - 0s - 2ms/step - loss: 0.0360 - recall: 0.8726 - val_loss: 0.0500 - val_recall: 0.8619
Epoch 72/800
140/140 - 0s - 2ms/step - loss: 0.0349 - recall: 0.8713 - val_loss: 0.0502 - val_recall: 0.8619
Epoch 73/800
140/140 - 0s - 2ms/step - loss: 0.0350 - recall: 0.8842 - val_loss: 0.0502 - val_recall: 0.8619
Epoch 74/800
140/140 - 0s - 3ms/step - loss: 0.0352 - recall: 0.8790 - val_loss: 0.0501 - val_recall: 0.8619
Epoch 75/800
140/140 - 0s - 3ms/step - loss: 0.0344 - recall: 0.8752 - val_loss: 0.0501 - val_recall: 0.8679
Epoch 76/800
140/140 - 0s - 3ms/step - loss: 0.0354 - recall: 0.8829 - val_loss: 0.0500 - val_recall: 0.8619
Epoch 77/800
140/140 - 0s - 4ms/step - loss: 0.0358 - recall: 0.8803 - val_loss: 0.0500 - val_recall: 0.8679
Epoch 78/800
140/140 - 0s - 2ms/step - loss: 0.0344 - recall: 0.8829 - val_loss: 0.0500 - val_recall: 0.8649
Epoch 79/800
140/140 - 0s - 2ms/step - loss: 0.0349 - recall: 0.8739 - val_loss: 0.0499 - val_recall: 0.8679
Epoch 80/800
140/140 - 0s - 2ms/step - loss: 0.0357 - recall: 0.8790 - val_loss: 0.0498 - val_recall: 0.8679
Epoch 81/800
140/140 - 0s - 3ms/step - loss: 0.0353 - recall: 0.8790 - val_loss: 0.0498 - val_recall: 0.8709
Epoch 82/800
140/140 - 0s - 2ms/step - loss: 0.0349 - recall: 0.8764 - val_loss: 0.0499 - val_recall: 0.8679
Epoch 83/800
140/140 - 0s - 2ms/step - loss: 0.0348 - recall: 0.8803 - val_loss: 0.0500 - val_recall: 0.8679
Epoch 84/800
140/140 - 0s - 2ms/step - loss: 0.0335 - recall: 0.8906 - val_loss: 0.0499 - val_recall: 0.8709
Epoch 85/800
140/140 - 0s - 2ms/step - loss: 0.0339 - recall: 0.8842 - val_loss: 0.0499 - val_recall: 0.8709
Epoch 86/800
140/140 - 0s - 2ms/step - loss: 0.0343 - recall: 0.8893 - val_loss: 0.0498 - val_recall: 0.8709
Epoch 87/800
140/140 - 0s - 2ms/step - loss: 0.0347 - recall: 0.8803 - val_loss: 0.0499 - val_recall: 0.8739
Epoch 88/800
140/140 - 0s - 2ms/step - loss: 0.0333 - recall: 0.8855 - val_loss: 0.0497 - val_recall: 0.8709
Epoch 89/800
140/140 - 0s - 2ms/step - loss: 0.0334 - recall: 0.8829 - val_loss: 0.0496 - val_recall: 0.8739
Epoch 90/800
140/140 - 0s - 2ms/step - loss: 0.0336 - recall: 0.8893 - val_loss: 0.0497 - val_recall: 0.8739
Epoch 91/800
140/140 - 0s - 2ms/step - loss: 0.0336 - recall: 0.8893 - val_loss: 0.0497 - val_recall: 0.8709
Epoch 92/800
140/140 - 0s - 2ms/step - loss: 0.0326 - recall: 0.8803 - val_loss: 0.0500 - val_recall: 0.8739
Epoch 93/800
140/140 - 0s - 2ms/step - loss: 0.0342 - recall: 0.8919 - val_loss: 0.0498 - val_recall: 0.8709
Epoch 94/800
140/140 - 0s - 2ms/step - loss: 0.0331 - recall: 0.8829 - val_loss: 0.0498 - val_recall: 0.8709
Epoch 95/800
140/140 - 0s - 2ms/step - loss: 0.0333 - recall: 0.8790 - val_loss: 0.0499 - val_recall: 0.8739
Epoch 96/800
140/140 - 0s - 2ms/step - loss: 0.0320 - recall: 0.8880 - val_loss: 0.0500 - val_recall: 0.8709
Epoch 97/800
140/140 - 0s - 2ms/step - loss: 0.0334 - recall: 0.8803 - val_loss: 0.0501 - val_recall: 0.8709
Epoch 98/800
140/140 - 0s - 2ms/step - loss: 0.0326 - recall: 0.8855 - val_loss: 0.0501 - val_recall: 0.8709
Epoch 99/800
140/140 - 0s - 2ms/step - loss: 0.0333 - recall: 0.8855 - val_loss: 0.0501 - val_recall: 0.8709
Epoch 100/800
140/140 - 0s - 2ms/step - loss: 0.0323 - recall: 0.8867 - val_loss: 0.0501 - val_recall: 0.8709
Epoch 101/800
140/140 - 0s - 2ms/step - loss: 0.0327 - recall: 0.8829 - val_loss: 0.0502 - val_recall: 0.8709
Epoch 102/800
140/140 - 0s - 2ms/step - loss: 0.0316 - recall: 0.8842 - val_loss: 0.0502 - val_recall: 0.8709
Epoch 103/800
140/140 - 0s - 2ms/step - loss: 0.0328 - recall: 0.8816 - val_loss: 0.0501 - val_recall: 0.8709
Epoch 104/800
140/140 - 0s - 2ms/step - loss: 0.0326 - recall: 0.8880 - val_loss: 0.0500 - val_recall: 0.8709
Epoch 104: early stopping
Restoring model weights from the end of the best epoch: 89.
In [132]:
#plot model's loss
plot(history, 'loss')
No description has been provided for this image
In [133]:
#plot model's recall
plot(history, 'recall')
No description has been provided for this image
In [134]:
#add model to our results df
results.loc[12] = [
  2, #hidden layers
  [64, 128], #neurons/layer
  ["relu", "relu"], #activation function
  epochs, #epochs
  batch_size, #batch size
  "Adam", #optimizer
  [0.0001, "-"], # learning rate, momentum
  ["Xav", "Xav", "Xav"], #weight initializer
  ["Batch norm", "Dropout (0.2)", "Class weights"], #regularization
  history.history["loss"][-1], #train loss
  history.history["val_loss"][-1], #val loss
  history.history["recall"][-1], #train recall
  history.history["val_recall"][-1], #val recall
  round(end-start,2) #trt
  ]
results
Out[134]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
4 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.0001, 0.9] [Xav, Xav, Xav] - 0.085880 0.092282 0.646075 0.672673 15.04
5 2 [64, 128] [relu, relu] 50 100 Adam [0.001, -] [Xav, Xav, Xav] - 0.007018 0.099477 0.963964 0.867868 15.84
6 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] - 0.022984 0.056675 0.909910 0.861862 31.01
7 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] Dropout (0.2) 0.037275 0.052004 0.870013 0.873874 31.34
8 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2)] 0.033385 0.048827 0.885457 0.873874 35.42
9 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2), Class weights] 0.228398 0.102819 0.935650 0.888889 34.58
10 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [He, He, Xav] [Batch norm, Dropout (0.2), Class weights] 0.273939 0.112600 0.921493 0.885886 34.35
11 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [L2, L2, Class weights] 0.272880 0.210015 0.948520 0.885886 31.58
12 2 [64, 128] [relu, relu] 800 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2), Class weights] 0.032637 0.050026 0.888031 0.870871 36.30
In [196]:
model_12_train_perf = model_performance_classification(model12, X_train, y_train)
model_12_train_perf
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 758us/step
Out[196]:
Accuracy Recall Precision F1 Score
0 0.994 0.950186 0.992153 0.970114
In [197]:
model_12_val_perf = model_performance_classification(model12, X_val, y_val)
model_12_val_perf
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 783us/step
Out[197]:
Accuracy Recall Precision F1 Score
0 0.992167 0.936496 0.987872 0.960572

Model Performance Comparison and Final Model Selection¶

Now, in order to select the final model, we will compare the performances of all the models for the training and validation sets.

In [136]:
results
Out[136]:
# hidden layers # neurons - hidden layer activation function - hidden layer # epochs batch size optimizer learning rate, momentum weight initializer regularization train loss validation loss train recall validation recall time (secs)
0 1 64 relu 50 100 SGD [0.001, -] - - 0.051616 0.066721 0.832690 0.804805 14.10
1 2 [64, 128] [relu, relu] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.043756 0.060069 0.858430 0.840841 14.28
2 2 [64, 64] [relu, tanh] 50 100 SGD [0.001, -] [Xav, Xav, Xav] - 0.042877 0.054599 0.848134 0.828829 14.53
3 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.001, 0.9] [Xav, Xav, Xav] - 0.019997 0.058326 0.925354 0.870871 16.27
4 2 [64, 128] [relu, relu] 50 100 SGD with mom [0.0001, 0.9] [Xav, Xav, Xav] - 0.085880 0.092282 0.646075 0.672673 15.04
5 2 [64, 128] [relu, relu] 50 100 Adam [0.001, -] [Xav, Xav, Xav] - 0.007018 0.099477 0.963964 0.867868 15.84
6 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] - 0.022984 0.056675 0.909910 0.861862 31.01
7 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] Dropout (0.2) 0.037275 0.052004 0.870013 0.873874 31.34
8 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2)] 0.033385 0.048827 0.885457 0.873874 35.42
9 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2), Class weights] 0.228398 0.102819 0.935650 0.888889 34.58
10 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [He, He, Xav] [Batch norm, Dropout (0.2), Class weights] 0.273939 0.112600 0.921493 0.885886 34.35
11 2 [64, 128] [relu, relu] 100 100 Adam [0.0001, -] [Xav, Xav, Xav] [L2, L2, Class weights] 0.272880 0.210015 0.948520 0.885886 31.58
12 2 [64, 128] [relu, relu] 800 100 Adam [0.0001, -] [Xav, Xav, Xav] [Batch norm, Dropout (0.2), Class weights] 0.032637 0.050026 0.888031 0.870871 36.30
  • Validation loss score for Model 8 is the lowest of 0.04 followed by model 12 of 0.05
  • Test loss score is better for model 12 is 0.032 followed by model 8 of 0.33 Since the dataset is an imbalanced dataset we will look at f1 score as well
In [217]:
models_train_comp_df = pd.concat(
    [
      model_0_train_perf.T,
      model_1_train_perf.T,
      model_2_train_perf.T,
      model_3_train_perf.T,
      model_4_train_perf.T,
      model_5_train_perf.T,
      model_6_train_perf.T,
      model_7_train_perf.T,
      model_8_train_perf.T,
      model_9_train_perf.T,
      model_10_train_perf.T,
      model_11_train_perf.T,
      model_12_train_perf.T

    ], axis=1 # Concatenate horizontally
)
models_train_comp_df.columns = [
    "Model 0",
    "Model 1",
    "Model 2",
    "Model 3",
    "Model 4",
    "Model 5",
    "Model 6",
    "Model 7",
    "Model 8",
    "Model 9",
    "Model 10",
    "Model 11",
    "Model 12"
]
print("Train set performance comparison:")
models_train_comp_df
Train set performance comparison:
Out[217]:
Model 0 Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8 Model 9 Model 10 Model 11 Model 12
Accuracy 0.989643 0.991357 0.991214 0.995643 0.978571 0.997786 0.994786 0.994000 0.994071 0.986929 0.985571 0.988286 0.994000
Recall 0.916383 0.929405 0.926301 0.961958 0.822092 0.980657 0.956053 0.951397 0.952041 0.962796 0.962683 0.973811 0.950186
Precision 0.983115 0.987085 0.989027 0.996390 0.966075 0.998200 0.993940 0.990849 0.990895 0.920772 0.911299 0.923886 0.992153
F1 Score 0.946958 0.956197 0.955241 0.978475 0.879351 0.989251 0.974143 0.970190 0.970564 0.940634 0.935249 0.947251 0.970114
In [222]:
models_val_comp_df = pd.concat(
    [
      model_0_val_perf.T,
      model_1_val_perf.T,
      model_2_val_perf.T,
      model_3_val_perf.T,
      model_4_val_perf.T,
      model_5_val_perf.T,
      model_6_val_perf.T,
      model_7_val_perf.T,
      model_8_val_perf.T,
      model_9_val_perf.T,
      model_10_val_perf.T,
      model_11_val_perf.T,
      model_12_val_perf.T

    ], axis=1 # Concatenate horizontally
)
models_val_comp_df.columns = [
    "Model 0",
    "Model 1",
    "Model 2",
    "Model 3",
    "Model 4",
    "Model 5",
    "Model 6",
    "Model 7",
    "Model 8",
    "Model 9",
    "Model 10",
    "Model 11",
    "Model 12"
]
print("Validation set performance comparison:")
models_val_comp_df
Validation set performance comparison:
Out[222]:
Model 0 Model 1 Model 2 Model 3 Model 4 Model 5 Model 6 Model 7 Model 8 Model 9 Model 10 Model 11 Model 12
Accuracy 0.987500 0.990000 0.989667 0.990833 0.978833 0.992000 0.990667 0.991667 0.991500 0.981500 0.982500 0.982333 0.992167
Recall 0.901520 0.919803 0.913973 0.934377 0.834748 0.933581 0.930049 0.936231 0.936143 0.937915 0.937032 0.936943 0.936496
Precision 0.976335 0.983166 0.986120 0.976359 0.953345 0.989319 0.979132 0.982939 0.981316 0.896714 0.904089 0.902966 0.987872
F1 Score 0.935333 0.948977 0.946789 0.954273 0.884007 0.959551 0.953093 0.958244 0.957472 0.916140 0.919816 0.919162 0.960572
In [223]:
models_train_comp_df.loc["F1 Score"] - models_val_comp_df.loc["F1 Score"]
Out[223]:
F1 Score
Model 0 0.011625
Model 1 0.007220
Model 2 0.008452
Model 3 0.024202
Model 4 -0.004657
Model 5 0.029699
Model 6 0.021050
Model 7 0.011946
Model 8 0.013091
Model 9 0.024494
Model 10 0.015433
Model 11 0.028089
Model 12 0.009542

  • Lowest f1 score is for Model 12 hence we will pick model 12 as our model

Now, let's check the performance of the final model on the test set.

In [230]:
y_train_pred = model12.predict(X_train)
cr_train = classification_report(y_train, y_train_pred > 0.5)
print(cr_train)
438/438 ━━━━━━━━━━━━━━━━━━━━ 0s 780us/step
              precision    recall  f1-score   support

           0       0.99      1.00      1.00     13223
           1       0.99      0.90      0.94       777

    accuracy                           0.99     14000
   macro avg       0.99      0.95      0.97     14000
weighted avg       0.99      0.99      0.99     14000

In [231]:
#confusion matrix for training set
cm_train = confusion_matrix(y_train, y_train_pred > 0.5)
sns.heatmap(cm_train, annot = True, fmt = '' )
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()
No description has been provided for this image
In [232]:
#percentages confusion matrix
sns.heatmap(cm_train/np.sum(cm_train), annot=True, fmt = '.2%')
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()
No description has been provided for this image
In [234]:
#checking model 12's metrics on the validation set
y_val_pred = model12.predict(X_val)
cr_val = classification_report(y_val, y_val_pred > 0.5)
print(cr_val)
188/188 ━━━━━━━━━━━━━━━━━━━━ 0s 768us/step
              precision    recall  f1-score   support

           0       0.99      1.00      1.00      5667
           1       0.98      0.87      0.93       333

    accuracy                           0.99      6000
   macro avg       0.99      0.94      0.96      6000
weighted avg       0.99      0.99      0.99      6000

In [235]:
#confusion matrix for validation set
cm_val = confusion_matrix(y_val, y_val_pred > 0.5)
sns.heatmap(cm_val, annot = True, fmt = '' )
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()
No description has been provided for this image
In [236]:
#percentages confusion matrix
sns.heatmap(cm_val/np.sum(cm_val), annot=True, fmt = '.2%')
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()
No description has been provided for this image
In [237]:
score = model12.evaluate(X_test, y_test)
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.0524 - recall: 0.8279  
In [238]:
#checking model 12's metrics on the test set
y_test_pred = model12.predict(X_test)
cr_test = classification_report(y_test, y_test_pred > 0.5)
print(cr_test)
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step  
              precision    recall  f1-score   support

           0       0.99      1.00      1.00      4718
           1       0.98      0.86      0.92       282

    accuracy                           0.99      5000
   macro avg       0.99      0.93      0.96      5000
weighted avg       0.99      0.99      0.99      5000

In [240]:
#confusion matrix for test set
cm_test = confusion_matrix(y_test, y_test_pred > 0.5)
sns.heatmap(cm_test, annot=True, fmt = '')
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()
No description has been provided for this image
In [241]:
#percentages confusion matrix
sns.heatmap(cm_test/np.sum(cm_test), annot=True, fmt = '.2%')
plt.xlabel("Predicted")
plt.ylabel("True")
plt.show()
No description has been provided for this image
In [229]:
best_model = model12
best_model_test_perf = model_performance_classification(best_model,X_test,y_test)
best_model_test_perf
157/157 ━━━━━━━━━━━━━━━━━━━━ 0s 908us/step
Out[229]:
Accuracy Recall Precision F1 Score
0 0.9912 0.928654 0.987663 0.956011

Actionable Insights and Recommendations¶

Write down some insights and business recommendations based on your observations:

  • Model 12, our final model, achieved a loss score of 0.0523 and a recall score of 0.8279 on the unseen test set
  • The loss is a big improvement from Model 9 (0.1026), which is an early version of Model 12, and is very close to Model 12's loss score on the validation set (0.0500)
  • The recall score is a bit lower than Model 12's validation recall score (0.8708)
  • The classification report shows Model 12 was basically perfect at identifying generators that would not fail, and 86% successful at identifying generators that would fail
  • There is room for improvement in this model, given the difference between the test set recall score and the validation recall score. Further optimizations could be attempted on Model 12 -- including adding more hidden layers, adding different numbers of neurons to those layers, adding and removing Batch Normalization and Dropout layers, testing different Dropout levels, further reducing the learning rate and increasing the number of epochs, and re-introducing He initialization (which didn't seem to help much in Model 10, but perhaps it would be more effective on Model 12).
  • L1 and L2 regularization could also be introduced; we never tried L1 regularization, so there is opportunity there.
  • That said, Model 12 does a great job of minimizing the number of "false negative" generator failure predictions -- only 0.80% of cases the model predicted to not fail, ended up failing (where 0 is "no failure" and 1 is "failure")
  • This means the number of cases resulting in costly replacements is very low ReneWind should maintain an aggressive policy of inspections, since inspections are cheaper than replacements or repairs. In addition to inspecting any generators Model 12 predicts will fail, the company can do spot inspections on a percentage of the generators Model 12 predicts will not fail -- in an effort to catch some of the false negatives that do make it through -- since doing an extra inspection will ultimately be cheaper than doing a replacement, which is what the false negatives will ultimately result in